Module 5 Audio Compression.

Define the unit decibel (dB). What aspect of the audio signal does it measure?

Distinguish between silence and companding strategies for lossy audio compression.
Audible noise (sound) is generated by physically vibrating material.

Any object, from a guitar string to a human vocal cord, can produce sound when it vibrates.
When an object vibrates, it disturbs the surrounding air molecules, causing them to move
back and forth.

The vibration produces pressure waves in the air; these pressure waves travel through the air and ultimately cause our eardrums to vibrate.

The vibration of our eardrums is converted into electrical pulses.

The brain interprets these electrical signals as sound, allowing us to perceive and understand
the auditory information. The brain processes various aspects of the sound, such as its pitch,
volume, and timbre, enabling us to recognize and differentiate between different sounds.
Since sound is a pressure wave, it takes on continuous values.

Like any other wave, sound has three important attributes: its speed, amplitude, and period.

▹ Amplitude is the measure of the sound wave's strength or its maximum displacement
from a rest position.

▹ The speed of sound refers to how fast sound waves travel through a medium, such as air, water, or steel. It is determined by the medium's properties, especially its temperature, density, and elasticity.

Frequency measures how many cycles (vibrations or oscillations) a sound wave completes in one second. It is expressed in hertz (Hz), where one hertz equals one cycle per second.

The speed of sound depends mostly on the medium it passes through and on the temperature.

The human ear is sensitive to a wide range of sound frequencies, normally from about 16–20 Hz to about 20–22 kHz, depending on the person's age and health. This is the range of audible frequencies.

The sensitivity of the human ear to sound level depends on the frequency.

The range of the human voice is much more limited. It is only from about 500 Hz to about 2 kHz

The period of a sound wave is the time it takes for one complete cycle of the wave to pass a given
point. It's the reciprocal of the frequency, representing the duration of one cycle in seconds

WHAT IS A DECIBEL:

The problem with measuring noise intensity is that the ear is sensitive to a very wide range of sound levels (amplitudes), from 1 to about 10^11.

It is inconvenient to deal with measurements in such a wide range, which is why the units of sound loudness use a logarithmic scale.

The (base-10) logarithm of 1 is zero, and the logarithm of 10^11 is 11.

Using logarithms, we only have to deal with numbers in the range 0 through 11. We multiply the logarithm by 10 (or by 20) to get the decibel system of measurement.

The decibel (dB) unit is defined as the base-10 logarithm of the ratio of two physical quantities whose units are powers. The logarithm is then multiplied by the convenient scale factor 10. (If the scale factor is not used, the result is measured in units called "bel".)

Thus, we have

Level = 10 log10(P1/P2) dB,

where P1 and P2 are measured in units of power, such as watt or joule/sec.

The numerator, P1, is the power (in microwatts) of the sound whose intensity level is being measured.

It is convenient to select as the denominator P2 the number of microwatts that produces the faintest audible sound. This number is shown by experiment to be 10^-6 microwatt = 10^-12 watt.

E.g., if P1 = 1 watt, the level is 10 log10(1/10^-12) = 10 × 12 = 120 dB.

When working with the sound pressure Pr instead of the power (the power P is proportional to the square of the sound pressure Pr), the scale factor becomes 20:

SPL = 20 log10(Pr1/Pr2) dB,

where SPL stands for sound pressure level.

E.g., let the sound level initially be 70 dB. What happens if P1 is doubled? The level increases by 10 log10 2 ≈ 3 dB, to about 73 dB.
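A small illustrative Python sketch (not part of the original notes) reproduces these computations:

```python
import math

def level_db(p_watts, p_ref=1e-12):
    """Sound level in dB relative to the threshold of hearing (10^-12 W)."""
    return 10 * math.log10(p_watts / p_ref)

print(level_db(1.0))                      # 120.0 dB for a 1-watt source
print(level_db(2e-7) - level_db(1e-7))    # doubling the power adds ~3.01 dB
```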

Digitization of Sound

• Analog audio represents sound waves as continuously varying electrical signals

• Digital audio represents sound using discrete numerical values, typically in binary form
(0s and 1s).

• Digitization is the process of representing various types of information in a form that can be stored and processed by a digital device. It is the combined operation of sampling, quantization, and encoding, also called analog-to-digital (A/D) conversion.

• A digital-to-analog converter (DAC) converts the numeric samples back into voltages that are continuously fed into a speaker.

For audio, typical sampling rates are from 8 kHz to 48 kHz. This range is determined by the Nyquist theorem.
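As a rough sketch of the three operations (sampling, quantization, encoding), assuming a signal already normalized to [−1, +1]:

```python
import math

def digitize(signal, duration_s, rate_hz, bits):
    """Sample an analog signal (a function of time) at the given rate and
    quantize each sample to a signed integer of the given bit depth."""
    levels = 2 ** (bits - 1)                  # e.g. 32768 for 16-bit audio
    samples = []
    for i in range(int(duration_s * rate_hz)):
        t = i / rate_hz                       # sampling instant
        x = signal(t)                         # analog value in [-1, +1]
        samples.append(round(x * (levels - 1)))  # quantization + encoding
    return samples

def tone(t):
    return math.sin(2 * math.pi * 440 * t)   # a 440 Hz test tone

pcm = digitize(tone, duration_s=0.01, rate_hz=8000, bits=16)
```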

Audio sampling technique: PULSE CODE MODULATION (PCM)

Nyquist Theorem

• The Nyquist theorem states how frequently we must sample to be able to recover the original
sound.

For correct sampling we must use a sampling rate equal to at least twice the maximum frequency content in the signal (or twice the bandwidth of the signal). This rate is called the Nyquist rate.

• The range of human hearing is typically from 16–20 Hz to 20,000–22,000 Hz, depending
on the person and on age. When sound is digitized at high fidelity, it should therefore be
sampled at a little over the Nyquist rate of 2×22000 = 44000 Hz. This is why high-quality
digital sound is based on a 44,100-Hz sampling rate.
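A quick illustration of why the rate matters (the 5 kHz tone here is a hypothetical example, not one from the notes):

```python
import math

def sample_tone(freq_hz, rate_hz, n=8):
    """Return the first n samples of a cosine at the given frequency."""
    return [round(math.cos(2 * math.pi * freq_hz * i / rate_hz), 3)
            for i in range(n)]

# A 5 kHz tone sampled at only 8 kHz (below its Nyquist rate of 10 kHz)
# yields exactly the same samples as a 3 kHz tone: it aliases to 8 - 5 = 3 kHz.
print(sample_tone(5000, 8000))
print(sample_tone(3000, 8000))
```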

The Human Auditory System

The frequency range of the human ear is from about 20 Hz to about 20,000 Hz, but the ear's sensitivity to sound is not uniform. It depends on the frequency, and experiments indicate that in a quiet environment the ear's sensitivity is maximal for frequencies in the range 2 kHz to 4 kHz.

The existence of the hearing threshold suggests an approach to lossy audio compression: simply delete any audio samples that are below the threshold. If a signal at frequency f is weaker than the hearing threshold at f, it (the signal) should be deleted.
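A minimal sketch of this idea, assuming the signal has already been transformed into (frequency, level) components and that threshold_db supplies the threshold-in-quiet curve:

```python
def drop_inaudible(spectrum, threshold_db):
    """Keep only the spectral components that rise above the hearing
    threshold at their own frequency; the rest need not be encoded.

    spectrum:     list of (frequency_hz, level_db_spl) pairs
    threshold_db: function mapping a frequency in Hz to the hearing
                  threshold at that frequency, in dB SPL
    """
    return [(f, lvl) for f, lvl in spectrum if lvl >= threshold_db(f)]
```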

Psychoacoustic Modelling

Psychoacoustic modelling helps in understanding how humans perceive sound.

Key Concepts in Psychoacoustic Modelling

1. Frequency Masking (Simultaneous Masking)


o Occurs when a loud sound (the masker) makes nearby frequencies less audible.
o Used in perceptual audio coding (e.g., MP3, AAC) to remove inaudible frequencies.
2. Temporal Masking
o A sound is masked by another that occurs just before or after it.
o Helps reduce data in audio compression without perceptible loss.
3. Critical Bands and the Bark Scale
o The ear perceives sound in frequency bands rather than individual frequencies.
o The Bark scale represents these critical bands, aiding in perceptual audio analysis.

The range of audible frequencies can be partitioned into a number of critical bands
▹ Critical bands are frequency bands within which two tones may interfere with each other
and be perceived as a single auditory event.
The critical band concept helps us understand how the threshold at a specific frequency is
affected by nearby sounds. If a sound occurs within the critical band of a certain frequency, it
has the potential to raise the threshold at that frequency.

Here's a breakdown of how it works:

1. Critical Bands: Each frequency has its own critical band, which is like a listening zone
or range of frequencies that the ear perceives as a group. This critical band widens as
the frequency increases.
2. Threshold Increase: When a sound occurs within the critical band of a particular
frequency, it can raise the threshold or sensitivity level at that frequency. This means
that the ear becomes less sensitive to other sounds in that frequency range because it's
already occupied by a significant sound.
3. Effects of Nearby Sounds: Sounds occurring outside of the critical band of a specific
frequency typically don't affect the threshold at that frequency. However, sounds within
the critical band can mask or obscure other sounds nearby, making them harder to
detect.

The width of a critical band is called its size. The widths of the critical bands introduce a new unit, the Bark: one Bark is the width (in Hz) of one critical band. The Bark is defined as

1 Bark = f/100 for frequencies f < 500 Hz,
1 Bark = 9 + 4 log2(f/1000) for frequencies f ≥ 500 Hz.

In audio compression, knowledge of critical bands is utilized to allocate bits more efficiently.
Rather than allocating the same number of bits to every frequency component, more bits can be
assigned to critical bands with significant audio information while fewer bits are allocated to
less critical bands where the human ear is less sensitive.
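A direct Python transcription of the piecewise Bark definition given above:

```python
import math

def hz_to_bark(f_hz):
    """Bark value of a frequency, per the piecewise definition above."""
    if f_hz < 500:
        return f_hz / 100
    return 9 + 4 * math.log2(f_hz / 1000)

print(hz_to_bark(200))    # 2.0
print(hz_to_bark(1000))   # 9.0
print(hz_to_bark(8000))   # 21.0
```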

Frequency masking (also known as auditory masking)


When two sounds fall within the same critical band, they interfere with each other more than if
they were in separate bands. This interference can result in masking, where one sound makes
another sound harder to hear.

How One Signal Raises the Threshold of Another (Masking Effect)

When a strong (loud) signal is present in a critical band, it raises the threshold of hearing for
other signals in that band. This means weaker signals that would normally be heard become
inaudible. This effect is called simultaneous masking and can occur in two ways:

Upward Masking: A low-frequency (bass) sound masks higher-frequency sounds. This


happens because low-frequency sounds create broader excitation patterns in the cochlea.

Downward Masking: A high-frequency sound slightly masks lower frequencies, though this
effect is weaker.

▹ A louder sound within a critical band can mask a quieter sound occurring nearby in frequency.
▹ Because of the ear's limited perception of frequencies, the threshold at a frequency f is raised by a nearby sound only if the sound is within the critical band of f.

A strong sound source raises the normal threshold in its vicinity, with the result that a nearby weaker sound x, one that would normally be audible because it is above the threshold in quiet, is masked and becomes inaudible.

A good lossy audio compression method should identify this case and delete the signals corresponding to sound x, since they cannot be heard anyway.

NOISE REJECTION BY MEANS OF MASKING

a) Signal to noise ratio (SNR): The ratio of the power of the correct signal to the power of the noise is called the signal to noise ratio (SNR), a measure of the quality of the signal. The SNR is usually measured in decibels (dB):

SNR = 10 log10(Psignal/Pnoise).

b) Signal to mask ratio (SMR): The SMR at a given frequency is expressed as the difference (in dB) between the SPL of the masker and the masking threshold at that frequency.

c) Mask to noise ratio (MNR): The MNR at a given frequency is expressed as the difference (in dB) between the masking threshold at that frequency and the noise level. To make the noise inaudible, its level should be below the masking threshold, i.e., the MNR should be positive.
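These three ratios are simple to express in code; a small sketch (function names are illustrative):

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB (ratio of powers)."""
    return 10 * math.log10(signal_power / noise_power)

def smr_db(masker_spl_db, mask_threshold_db):
    """Signal-to-mask ratio: masker SPL minus the masking threshold."""
    return masker_spl_db - mask_threshold_db

def mnr_db(mask_threshold_db, noise_level_db):
    """Mask-to-noise ratio; positive means the noise is inaudible."""
    return mask_threshold_db - noise_level_db
```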


Temporal masking may occur when a strong sound A of frequency f is preceded or followed in time by a weaker sound B at a nearby (or the same) frequency. If the time interval between the sounds is short, sound B may not be audible.

Sounds that occur in an interval around the masking sound (both before and after the masking tone) can be masked. If the masked sound occurs prior to the masking tone, this is called premasking or backward masking; if the sound being masked occurs after the masking tone, the effect is called postmasking or forward masking.

Psychoacoustic Model Summary

A psychoacoustic model is a mathematical model that simulates the human auditory system's
perception of sound. It aims to replicate how humans hear and perceive audio signals, taking
into account factors such as frequency masking, temporal masking, and spatial localization.

The first step in the psychoacoustic model is to obtain a spectral profile of the signal being encoded. The audio input is windowed and transformed into the frequency domain using a filter bank or a frequency-domain transform. The sound pressure level (SPL) is calculated for each spectral band. If the algorithm uses a subband approach, then the SPL for a band is computed from the SPL of each frequency component within that band.

OTHER DEFINITIONS:

The dynamic range is the ratio of the maximum to minimum absolute values of the signal, Vmax/Vmin. Expressed in decibels, it is 20 log10(Vmax/Vmin).

SOUND COMPRESSION METHODS

▹ Conventional compression methods, such as RLE, statistical, and dictionary-based methods, can be used to losslessly compress sound files, but the results depend heavily on the specific sound.

▹ Better sound compression can be attained by developing lossy methods that take advantage of our perception of sound and discard data to which the human ear is not sensitive.

▹ two approaches

▹ silence compression

▹ companding
SILENCE COMPRESSION USES THE FOLLOWING PARAMETERS (a sketch follows the list):

1. A parameter that specifies the largest sample that should be suppressed.
2. A parameter that specifies the shortest run-length of small samples, typically 2 or 3.
3. A parameter that specifies the minimum number of consecutive large samples that should terminate a run of silence.
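A toy sketch of silence compression driven by the three parameters above (the ("SIL", length) token format is hypothetical):

```python
def silence_compress(samples, threshold=2, min_run=3, min_loud=2):
    """Replace sufficiently long runs of near-silent samples with a
    run-length token; short loud blips do not terminate a run.

    threshold: largest absolute value still treated as silence (param 1)
    min_run:   shortest run of small samples that is suppressed (param 2)
    min_loud:  consecutive large samples needed to end a run (param 3)
    """
    out, i, n = [], 0, len(samples)
    while i < n:
        if abs(samples[i]) <= threshold:
            j, loud = i, 0
            while j < n and loud < min_loud:       # scan the silence run
                loud = loud + 1 if abs(samples[j]) > threshold else 0
                j += 1
            j -= loud                              # exclude terminating loud samples
            if j - i >= min_run:
                out.append(("SIL", j - i))         # suppress: emit run-length token
            else:
                out.extend(samples[i:j])           # run too short: keep as-is
            i = j
        else:
            out.append(samples[i])
            i += 1
    return out
```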
▹ Compression:

▸ In the compression stage, the dynamic range of the signal is reduced.

▸ This is done by applying a non-linear function that reduces the amplitude of the
signal at higher levels while leaving the lower levels relatively unchanged.

▸ By compressing the dynamic range, weaker signals are amplified, and stronger
signals are attenuated, leading to a more uniform signal level.

▸ Because the dynamic range has been reduced, the signal requires less bandwidth
or storage space compared to the original uncompressed signal.

▹ Expansion

▸ At the receiving end, the compressed signal is expanded back to its original
dynamic range.

▸ This is achieved by applying the inverse of the compression function used in the
first stage.

▸ The weaker signals are attenuated, and the stronger signals are amplified,
restoring the original signal's dynamic range.
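An illustrative compressor/expander pair (a square-root law chosen purely for demonstration; the actual standards use the μ-law and A-law functions described below):

```python
import math

def compress(x):
    """Map a sample in [-1, +1] through a square-root law: small
    amplitudes are boosted, large ones squeezed."""
    return math.copysign(math.sqrt(abs(x)), x)

def expand(y):
    """Inverse mapping: restore the original dynamic range."""
    return math.copysign(y * y, y)

x = 0.04
assert abs(expand(compress(x)) - x) < 1e-12   # exact round trip before quantization
print(compress(0.04), compress(0.81))         # 0.2 and 0.9: the range is flattened
```

The gain appears when the compressed value is quantized: a uniform quantizer applied after compress() spends more of its levels on quiet samples, which is where the ear is most sensitive.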
The mapped 15-bit numbers can be decoded back into the original 16-bit samples by the inverse formula.

Disadvantage: reducing 16-bit numbers to 15 bits does not produce much compression. Hence, more sophisticated methods, such as μ-law and A-law, are commonly used; both have been made international standards.

Law Encoders:

▹ Refers to a device or algorithm used in companding systems.

▹ Companding, short for compression-expansion, is a technique used to improve the signal-to-noise ratio (SNR) of an analog signal by compressing the dynamic range of the signal before transmission or recording and then expanding it back to its original range at the receiving end.

▹ A law encoder is responsible for applying a specific mathematical function or encoding scheme to the input signal before compression.

▹ This encoding scheme determines how the input signal is mapped to the compressed domain.

▹ The most common law encoders are A-law and μ-law encoders.

▹ Each compresses the dynamic range of the input signal logarithmically, but with a different mathematical function.

▹ μ-law encoding allocates more quantization levels to low-amplitude signals and fewer levels to high-amplitude signals.

▹ In practice it is implemented by a piecewise linear function that maps the input signal to a quantized output value based on its amplitude.

The output of the compression function lies in the same interval [−1, +1] as the input; it is then scaled to the range [−256, +255] and represented as an 8-bit code. Bigger samples are decoded with more noise, and smaller samples are decoded with less noise.
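For reference, the exact μ-law compression function (the standard definition, with μ = 255 in the North American telephone standard) maps a normalized sample x in [−1, +1] to:

```latex
F(x) = \operatorname{sgn}(x)\,\frac{\ln(1 + \mu\,|x|)}{\ln(1 + \mu)},
\qquad -1 \le x \le 1,\ \mu = 255.
```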

Logarithms are slow to calculate, so the μ-law encoder performs much simpler calculations that produce an approximation.

▹ The 8-bit codeword has the format P S2 S1 S0 Q3 Q2 Q1 Q0. Here, P represents the sign of the input sample.

▹ Bits S2, S1, and S0 are the segment code.

▹ Bits Q3, Q2, Q1, and Q0 are the quantization code.

Encoder:

The μ-law encoder determines the codeword as follows (a sketch follows these steps):

a. It adds a bias of 33 to the absolute value of the input sample.

b. It determines the segment code by finding the bit position of the most significant 1-bit among bits 5 to 12 of the biased input and subtracting 5 from that position.

c. The 4-bit quantization code is set to the 4 bits following the bit position determined in step b.

d. The encoder ignores the remaining bits of the input sample and inverts the codeword at its output.
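A minimal Python sketch of these four steps, assuming 14-bit signed input samples (sign plus 13-bit magnitude, clamped to 8158 so that the biased value fits in bits 5 through 12); sign-bit polarity varies between implementations:

```python
def mu_law_encode(sample):
    """Encode one signed PCM sample (magnitude <= 8158) into an 8-bit
    mu-law codeword, following steps a-d above."""
    BIAS = 33
    p = 0 if sample >= 0 else 1             # P: sign bit (polarity is a convention)
    mag = min(abs(sample), 8158) + BIAS     # step a: add the bias of 33

    msb = mag.bit_length() - 1              # most significant 1-bit (position 5..12)
    segment = msb - 5                       # step b: 3-bit segment code S2 S1 S0

    quant = (mag >> (msb - 4)) & 0x0F       # step c: the 4 bits following the MSB

    codeword = (p << 7) | (segment << 4) | quant
    return codeword ^ 0xFF                  # step d: invert the codeword
```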
A-law Encoding

▹ A-law encoding is a logarithmic companding algorithm used in telecommunications, particularly in Europe (μ-law is the variant used in North America and Japan).

▹ It compresses the dynamic range of the input signal logarithmically, allocating more
quantization levels to low-amplitude signals and fewer levels to high-amplitude signals.

▹ A-law encoding is defined by a nonlinear piecewise function that maps the input signal
to a quantized output value based on its amplitude.

▹ It is characterized by a higher resolution for low-level signals, which improves the representation of quiet sounds and enhances the signal-to-noise ratio (SNR) of the transmitted or recorded audio.
The A-law encoder inputs 13-bit samples and generates an 8-bit codeword with the same format as the μ-law encoder.

It sets the P bit to the sign of the input sample.

It then determines the segment code by:

1. Determining the bit position of the most significant 1-bit among the seven most significant bits of the input.
2. If such a 1-bit is found, the segment code becomes that position minus 4; otherwise, the segment code becomes zero.

The 4-bit quantization code is set to the four bits following the bit position determined in step 1, or to half the input value if the segment code is zero.

The encoder ignores the remaining bits of the input sample, and it inverts bit P and the even-numbered bits of the codeword before it is output (a sketch follows).
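A matching sketch of the A-law steps, assuming 13-bit signed samples (sign plus 12-bit magnitude); as with μ-law, the sign convention is implementation-dependent:

```python
def a_law_encode(sample):
    """Encode one signed PCM sample (magnitude < 4096) into an 8-bit
    A-law codeword, following the steps above."""
    p = 0 if sample >= 0 else 1            # P: sign bit (convention varies)
    mag = min(abs(sample), 4095)

    msb = mag.bit_length() - 1             # most significant 1-bit position
    if msb >= 5:                           # step 1: a 1-bit among the seven MSBs (bits 5..11)
        segment = msb - 4                  # step 2: that position minus 4
        quant = (mag >> (msb - 4)) & 0x0F  # the 4 bits following the MSB
    else:
        segment = 0                        # step 2: no 1-bit among the seven MSBs
        quant = (mag >> 1) & 0x0F          # half the input value

    codeword = (p << 7) | (segment << 4) | quant
    return codeword ^ 0xD5                 # invert P and the even-numbered bits
```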
The two methods are similar; they differ
mostly in their quantizations (midtread vs. midriser).
