WIRELESS DIGITAL SYSTEMS
Module: 03
SOURCE AND CHANNEL CODING – DATA COMPRESSION TECHNIQUES
Faculty: Engineering
Study Program: Master of Electrical Engineering
Mudrik Alaydrus
Dian Widi Astuti
23 September 2023
Content:
Basics of Source Coding
Implementation
Speech Coding
Audio Coding
Image Coding
Video Coding
Basics of Channel Coding
Block Codes
Convolutional Codes
Basics of Source Coding
Why compress data?
• Raw PCM speech (sampled at 8 kHz, represented with 8 bits/sample) has a data
rate of 64 kbps
• Speech coding:
– Reducing the bit rate of a speech signal lets a single fiber or cable
carry more voice calls
– It is necessary for cellular phones (due to the limited data rate, <= 16 kbps)
– The lower the bit rate for a voice call, the more other
services (data/image/video) can be accommodated
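The 64 kbps figure follows directly from sample rate times bits per sample. A minimal sketch of that arithmetic (the function name `pcm_bit_rate` is only illustrative):

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Telephone speech: 8 kHz sampling, 8 bits per sample, one channel.
telephone_bps = pcm_bit_rate(8_000, 8)
print(telephone_bps / 1000, "kbps")
```
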
Example Coding Methods
• “ZIP”: no transformation or quantization; apply dictionary-based variable-length
coding (Lempel-Ziv family, e.g. LZ77 plus Huffman coding in DEFLATE) directly to the
stream of letters (symbols) in a file; lossless coding
• PCM for speech: no transformation, quantize the speech samples directly
using mu-law quantizer, apply fixed length binary coding
• ADPCM for speech: apply prediction to original samples, the predictor is
adapted from one speech frame to the next, quantize the prediction error,
error symbols coded using fixed length binary coding
• JPEG for image: apply discrete cosine transform to blocks of image pixels,
quantize the transformed coefficients, code the quantized coefficients
using variable length coding (runlength + Huffman coding)
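As an illustration of the "PCM for speech" bullet, the sketch below compresses a sample with the mu-law curve, quantizes it uniformly to 8 bits, and reverses both steps. It assumes mu = 255 and samples normalized to [-1, 1]; neither value is stated on the slide, and real G.711 coders use a segmented approximation of this curve rather than the continuous formula.

```python
import math

MU = 255    # assumed mu value (not given on the slide)
BITS = 8    # fixed-length binary code: 8 bits per sample


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map x in [-1, 1] to [-1, 1] with logarithmic compression."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def pcm_encode(x: float) -> int:
    """Compress, then quantize uniformly to an integer index 0..255."""
    levels = 2 ** BITS
    step = 2.0 / levels
    return min(int((mu_law_compress(x) + 1.0) / step), levels - 1)


def pcm_decode(index: int) -> float:
    """Reconstruct at the middle of the quantization cell, then expand."""
    step = 2.0 / 2 ** BITS
    return mu_law_expand(-1.0 + (index + 0.5) * step)


for sample in (-0.9, -0.01, 0.001, 0.5):
    print(sample, pcm_encode(sample), round(pcm_decode(pcm_encode(sample)), 5))
```

The companding step gives small amplitudes finer quantization cells than large ones, which is the reason mu-law PCM is preferred over uniform quantization for speech.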
Huffman Coding
[Figure: Huffman coding example; recoverable labels: 100010100010 and 2 02302]
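A minimal sketch of Huffman coding, assuming symbol probabilities are estimated from the message itself. The function names and the test message are illustrative and are not taken from the slide's figure.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, partial code table of the subtree).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # the two least probable subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:               # prefix-free: first match is a symbol
            out.append(inverse[current])
            current = ""
    return out


message = list("abracadabra")
code = huffman_code(message)
bits = huffman_encode(message, code)
print(code, bits, len(bits))
```

Frequent symbols receive short code words, so the encoded stream is shorter than a fixed-length code (here 3 bits per symbol for 5 distinct symbols).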
Speech Coding
• Speech coders are lossy coders, i.e. the decoded signal is
different from the original
• Goal: to minimize the distortion at a given bit rate, or
minimize the bit rate to reach a given distortion
• Figure of Merit
– Objective measure of distortion is SNR (Signal to noise ratio)
– SNR does not correlate well with perceived speech quality
• Speech quality is often measured by MOS (mean opinion score)
– 5: excellent
– 4: good
– 3: fair
– 2: poor
– 1: bad
• PCM at 64 kbps with mu-law or A-law has MOS = 4.5 to 5.0
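The objective measure mentioned above can be computed as the ratio of signal energy to error energy, expressed in dB. A minimal sketch (the sample values are made up for illustration):

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between an original and a decoded signal."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise_energy == 0:
        return math.inf                      # lossless reconstruction
    return 10 * math.log10(signal_energy / noise_energy)


original = [1.0, -1.0, 1.0, -1.0]
decoded = [0.9, -0.9, 0.9, -0.9]             # every sample is off by 0.1
print(round(snr_db(original, decoded), 2), "dB")
```

MOS, by contrast, cannot be computed from the samples alone, because it is an average of listener ratings.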
Other Speech Coding Standards
• Toll quality speech coder (digital wireline phone)
– G.711 (A-law and µ-law at 64 kbps)
– G.721 (ADPCM at 32 kbps)
– G.723 (ADPCM at 40, 24 kbps)
– G.726 (ADPCM at 16, 24, 32, 40 kbps)
• Low bit rate speech coder (cellular phone/IP phone)
– G.728 low delay (16 kbps, delay < 2 ms, same or better quality
than G.721)
– G.723.1 (CELP based, 5.3 and 6.3 kbps)
– G.729 (CELP based, 8 kbps)
– GSM 06.10 (13 and 6.5 kbps, simple to implement, used in
GSM phones)
Speech vs. Audio Coding
• Speech coding ➔ Targeted for telephony applications
- High rate waveform-based speech coder: for comfortable, natural sound, use
simple predictive coding techniques
- Low rate model-based speech coders: for intelligible speech, sufficient for
communication purposes, use speech-production models
• Audio coding ➔ For high quality production of music (including speech) in
multiple channels
- Music has a much wider bandwidth and multichannels
- Waveform-based to retain the natural sound quality
- Make extensive use of human hearing properties in determining the
quantization levels in different frequency bands
• Each frequency component is quantized with a step-size depending on the
hearing threshold
• Don’t code if the ear cannot hear it!
MPEG Standards Overview
MPEG: Moving Picture Experts Group
• MPEG-1: Defines coding standards for both audio and video, and how to
packetize the coded audio and video bits to provide time synchronization
– Total rate: 1.5 Mbps
– Video (352x240 pels/frame, 30 frame/s): 30 Mbps -> 1.2 Mbps
– Audio (2 channels, 48 K samples/s, 16 bits/sample):
2*768 kbps -> <=0.3 Mbps
– Applications: web movies, MP3 audio, video CD
• MPEG-2: for better quality audio and video
– Video: 720x480 pels/frame, 30 frames/s: 216 Mbps -> 3-5 Mbps
– Audio (5.1 channels), Advanced audio coding (AAC)
• MPEG-4: targeted for a variety of applications, with wide range of quality and
bit rate, but improved quality mainly at low bit rate
– For internet audio video streaming
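The raw MPEG-1 rates quoted above can be checked with a short calculation. The sketch assumes 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling), which the slide does not state. The 216 Mbps MPEG-2 figure is not reproduced here because its pixel format is not given.

```python
def raw_video_rate(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel


def raw_audio_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed audio bit rate in bits per second."""
    return sample_rate * bits_per_sample * channels


# MPEG-1 source formats from the list above; 12 bits/pixel is an assumption (4:2:0).
video_bps = raw_video_rate(352, 240, 30, 12)
audio_bps = raw_audio_rate(48_000, 16, 2)

print(f"video: {video_bps / 1e6:.1f} Mbps raw, ratio {video_bps / 1.2e6:.1f}:1 at 1.2 Mbps")
print(f"audio: {audio_bps / 1e3:.0f} kbps raw, ratio {audio_bps / 0.3e6:.1f}:1 at 0.3 Mbps")
```
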
MPEG-1 Audio Layers
• Layer 1: DCT type filter with equal frequency spread per band.
Psychoacoustic model only uses frequency masking.
• Layer 2: Same filter bank as layer 1.
Psychoacoustic model uses a little bit of the temporal masking.
• Layer 3 (MP3): Layer 1 filterbank followed by MDCT per band to obtain
non-uniform frequency division similar to critical bands.
Psychoacoustic model includes temporal masking effects, takes into
account stereo redundancy, and uses Huffman coder.
At the time of MPEG-1 audio development (finalized 1992), Layer 3 was considered
too complex to be practically useful.
But today, layer 3 is the most widely deployed audio coding method (known as
MP3), because it provides good quality at an acceptable bit rate.
It is also because the code for layer 3 is distributed freely.
3GPP:
• AMR Adaptive Multi-Rate
• AAC Advanced Audio Coding (LTE)
ITU:
• G.711 PCM
• G.722 ADPCM
• G.722.1C ADPCM
IETF (Internet Eng. Task Force):
• Opus
• iLBC (Internet Low Bitrate Codec)
• Speex
• Vorbis
https://opus-codec.org/
Image Coding Standards
• G3, G4: facsimile standard (1980)
• JBIG/JBIG2: the next-generation facsimile standard (1994 onward)
• JPEG: For coding still images or video frames. (1992)
• Lossless JPEG: for medical and archiving applications.
• JPEG2000: An improvement to JPEG, yielding better images at lower bit
rate, plus other features (scalability, error resilience)
JPEG2000
• Improved coding efficiency
• Full quality scalability
– From lossless to lossy at different bit rate
• Spatial scalability
• Improved error resilience
• Tiling
• Regions of interest
• More demanding in memory and computation time
How Does J2K Achieve Scalability?
• Core: Wavelet transform
– Yields a multi-resolution representation of an original image
• Still a transform coder
– Block DCT is replaced by a full frame wavelet transform
• Also known as subband decomposition
– Wavelet coefficients are coded bit plane by bit plane
– Spatial scalability can be achieved by reconstructing from
only low resolution wavelet coefficients
– Quality scalability can be achieved by decoding only
partial sets of bit planes, starting from the most
significant bit plane
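Quality scalability by partial bit-plane decoding can be sketched on a handful of integer coefficients. The coefficient values and the function name are hypothetical, and an actual JPEG2000 codec additionally entropy-codes each bit plane, which is omitted here.

```python
def keep_top_bit_planes(coeffs, total_planes, kept_planes):
    """Reconstruct coefficients from only the most significant magnitude bit planes."""
    dropped = total_planes - kept_planes
    out = []
    for c in coeffs:
        magnitude = (abs(c) >> dropped) << dropped   # zero the dropped low-order planes
        out.append(-magnitude if c < 0 else magnitude)
    return out


# Hypothetical wavelet coefficients; magnitudes fit in 7 bit planes.
coeffs = [113, -42, 7, 0, -3, 90]

for planes in (2, 3, 5, 7):
    approx = keep_top_bit_planes(coeffs, 7, planes)
    error = sum((a - b) ** 2 for a, b in zip(coeffs, approx))
    print(planes, approx, error)
```

Each additional bit plane refines every coefficient, so a decoder can stop early for a coarse image or continue to the last plane for the full-quality one.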
Some Video Applications
• Digital TV/HDTV broadcast over the air/cable/satellite
• Video-on-demand
– On-line video store
– CNN news video
• Internet Video broadcast/multicast
• DVD movies
• Home video capture and sharing
•…
HEVC
VVC (the newest standard)
Key Ideas in Video Coding
• Predict a new frame from a previous frame and only specify the prediction error
(INTER mode)
• Prediction error will be coded using an image coding method (e.g., DCT-based as
in JPEG)
• Prediction errors have smaller energy than the original pixel values and can be
coded with fewer bits
• Those regions that cannot be predicted well will be coded directly using DCT-
based method (INTRA mode)
• Use motion-compensated temporal prediction to account for object motion
• Use spatial directional prediction to exploit spatial correlation (H.264)
• Work on each macroblock (MB) (16x16 pixels) independently for
reduced complexity
– Motion compensation done at the MB level
– DCT coding of error at the block level (8x8 pixels or smaller)
– Block-based hybrid video coding
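Motion-compensated prediction at the block level can be sketched as a full-search block matcher that minimizes the sum of absolute differences (SAD). This is a toy example under stated assumptions: an 8x8 frame, a 4x4 block instead of a 16x16 macroblock, and a synthetic frame content; real coders use faster search strategies and sub-pixel accuracy.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(n)
        for x in range(n)
    )


def motion_search(cur, ref, bx, by, n, search):
    """Full search over displacements in [-search, search]; returns (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def residual_block(cur, ref, bx, by, dx, dy, n):
    """Prediction error that would be passed on to the transform coder."""
    return [[cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx] for x in range(n)]
            for y in range(n)]


SIZE = 8
# Synthetic reference frame with a distinct value at every pixel.
ref_frame = [[10 * y + x for x in range(SIZE)] for y in range(SIZE)]
# Current frame: cur[y][x] = ref[y - 1][x + 1], zero where that falls outside the frame.
cur_frame = [[ref_frame[y - 1][x + 1] if y >= 1 and x + 1 < SIZE else 0
              for x in range(SIZE)] for y in range(SIZE)]

cost, dx, dy = motion_search(cur_frame, ref_frame, bx=2, by=2, n=4, search=2)
residual = residual_block(cur_frame, ref_frame, 2, 2, dx, dy, 4)
print("motion vector:", (dx, dy), "SAD:", cost)
```

With a good motion vector the residual has far less energy than the original block, which is why INTER coding needs fewer bits.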
Channel Coding
Modern information and communication systems are based on the reliable and
efficient transmission of information.
Channels encountered in practical applications are usually disturbed, regardless
of whether they correspond to information transmission over noisy and
time-variant mobile radio channels or to information transmission on optical
discs that might be damaged by scratches.
Owing to these disturbances, appropriate channel coding schemes have to be
employed such that errors within the transmitted information can be detected
or even corrected.
Besides good code characteristics with respect to the number of errors that can
be detected or corrected, the complexity of the architectures used for
implementing the encoding and decoding algorithms is important for
practical applications.
Basic Principle
[Figure: block diagram; recoverable label: Encoder]
Channel Transmission
[Figure: original word 00101110 and the received word]
Taxonomy of Channel Coding Methods
1. Block Codes
Linear (decoding difficult)
Cyclic (special case of linear)
2. Convolutional Codes
3. Turbo Codes (beyond this lecture)
Fundamentals of Block Codes
Code rate
k/n
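As a concrete instance of a linear block code with rate k/n, the sketch below uses a systematic (7,4) Hamming code, i.e. k = 4 information bits and n = 7 code bits. The choice of this code and of the parity equations is an assumption made for illustration; the slide does not name a specific code.

```python
# Parity-check matrix H of a systematic (7,4) Hamming code.
# Code word layout: [d1, d2, d3, d4, p1, p2, p3].
H = [
    [1, 1, 0, 1, 1, 0, 0],   # p1 = d1 + d2 + d4
    [1, 0, 1, 1, 0, 1, 0],   # p2 = d1 + d3 + d4
    [0, 1, 1, 1, 0, 0, 1],   # p3 = d2 + d3 + d4
]
K, N = 4, 7                   # code rate R = K / N


def hamming_encode(data):
    """Append three parity bits to four data bits."""
    d1, d2, d3, d4 = data
    return [d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


def syndrome(word):
    """All-zero for a valid code word; otherwise equals the H column of the flipped bit."""
    return tuple(sum(h * b for h, b in zip(row, word)) % 2 for row in H)


def hamming_decode(word):
    """Correct at most one bit error and return the four data bits."""
    s = syndrome(word)
    corrected = list(word)
    if any(s):
        columns = [tuple(row[i] for row in H) for i in range(N)]
        corrected[columns.index(s)] ^= 1
    return corrected[:K]


codeword = hamming_encode([1, 0, 1, 1])
received = list(codeword)
received[2] ^= 1              # single bit error on the channel
print(codeword, received, hamming_decode(received), f"R = {K}/{N}")
```
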
Cyclic block codes, a special case of linear block codes,
allow efficient decoding.
Examples of cyclic block codes:
Bose-Chaudhuri-Hocquenghem (BCH)
Reed-Solomon (RS)
Convolutional codes
As with block codes, the basic procedure is to add redundancy.
The difference from block codes:
• convolutional codes have memory
• the current output depends not only on the current input, but also
on previous inputs
Application area:
Cellular systems (GSM, UMTS)
Dial-up modem
satellite communications
WLAN IEEE 802.11
etc.
Input u = (1, 1, 0, 1, 0, 0, . . .)
Output b = (1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, . . .).
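The input/output pair above is reproduced by a rate-1/2 encoder with two memory cells and generator polynomials 1 + D + D^2 and 1 + D^2 (octal 7, 5). These generators are an assumption, since the encoder figure is not part of the text; they were chosen because they yield exactly the output listed. A minimal sketch:

```python
def conv_encode(bits, generators=((1, 1, 1), (1, 0, 1))):
    """Rate-1/len(generators) convolutional encoder with a shift-register memory.

    Each generator lists the taps on (current input, previous input, input before that).
    """
    memory = len(generators[0]) - 1
    state = [0] * memory                      # shift register, initially all zero
    out = []
    for u in bits:
        window = [u] + state                  # current input followed by stored inputs
        for g in generators:
            out.append(sum(gi * wi for gi, wi in zip(g, window)) % 2)
        state = [u] + state[:-1]              # shift the new input into the register
    return out


u = [1, 1, 0, 1, 0, 0]
b = conv_encode(u)
print(b)
```

Two output bits are produced per input bit, so the code rate is R = 1/2, and each output pair depends on the current bit and the two previous ones.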
[Figure: convolutional encoder showing input, memory, and output]
Convolutional Coding in Mobile Communications
GSM and UMTS:
• Coding of speech (no delay allowed, tolerates more errors)
➔ convolutional coding
• Coding of data (delay allowed, requires low BER)
➔ combination of convolutional coding and
automatic repeat request (ARQ)
The Full-Rate (FR) codec was the first digital speech coding standard
used in the GSM digital mobile phone system developed
in the late 1980s.
It is still used in GSM networks, but will gradually be replaced by
Enhanced-Full-Rate (EFR) and Adaptive MultiRate (AMR) standards,
which provide much higher speech quality.
With the GSM full-rate codec, the analogue speech signal is usually sampled with
a sampling rate of 8000 samples per second and an 8-bit resolution of the
analogue-to-digital conversion. This results in a data rate of 64 kbps.
The FR speech coder reduces this rate to 13 kbps.
To become robust against transmission errors, these speech-coded data are
encoded with a convolutional code, yielding a transmission rate of 22.8 kbps.
The output of the FR codec is 13 kbps (13,000 bits per second),
i.e. 260 bits (one frame) every 20 ms.
The 260 bits of a frame are divided into classes:
• Class 1 (protected): 182 bits
– Class 1a (most important): 50 bits, cyclic (CRC) and convolutionally coded
– Class 1b (important): 132 bits, convolutionally coded
• Class 2 (unprotected): 78 bits
Channel coding:
182 + 3 CRC (Cyclic Redundancy Check) bits + 4 zero bits = 189 bits
encoded with R = 1/2 ➔ 378 bits
Together with the 78 unprotected bits ➔ 456 bits in 20 ms
This gives 22.8 kbps
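The GSM full-rate bit budget can be verified step by step. The sketch only restates the numbers given above; the variable names are illustrative.

```python
FRAME_MS = 20                                 # one speech frame every 20 ms

class_1a, class_1b, class_2 = 50, 132, 78     # bits per class
crc_bits, tail_bits = 3, 4                    # CRC on class 1a, zero bits to flush the encoder

speech_bits = class_1a + class_1b + class_2                 # FR codec output per frame
protected_in = class_1a + class_1b + crc_bits + tail_bits   # input to the convolutional coder
protected_out = protected_in * 2                            # code rate R = 1/2
frame_bits = protected_out + class_2                        # class 2 is sent uncoded

# bits per millisecond equals kbit per second
print(speech_bits / FRAME_MS, "kbps source,", frame_bits / FRAME_MS, "kbps on the channel")
```
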
The UMTS standard:
The UMTS speech codec defines coding modes for data rates from 4.75 to 12.2 kbps, where the
12.2 kbps mode is equivalent to the GSM EFR codec.
UMTS employs a powerful convolutional code with memory m = 8 and code rate R =
1/3.
The 12.2 kbps speech coding mode uses 20 ms speech frames with 244 data bits per
frame.
In the UMTS system, those 244 bits are also partitioned into three classes according to
their relevance to the speech decoder. Class A contains the 81 most important bits,
while class B with 103 bits and class C with 60 bits are less vital.
Only the bits of class A are protected with a CRC code. This CRC is more reliable than
the one in GSM FR, however, as eight redundancy bits (instead of three) are used for error detection.
All three classes are encoded with the same rate R = 1/3 convolutional code, where
each class is encoded separately. Hence, eight tail bits are added for each class. This
results in 828 code bits. Then, puncturing is used to reduce the number of code bits to
688 for each speech frame.
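The 828 code bits follow from the class sizes, the 8-bit CRC on class A, and eight tail bits per class. A minimal sketch of that count, using only the figures given above (variable names are illustrative):

```python
FRAME_MS = 20
class_bits = {"A": 81, "B": 103, "C": 60}     # 12.2 kbps mode, bits per 20 ms frame
crc_bits = {"A": 8, "B": 0, "C": 0}           # only class A carries a CRC
TAIL_BITS = 8                                 # memory m = 8, one tail per class
INVERSE_RATE = 3                              # code rate R = 1/3

data_bits = sum(class_bits.values())
code_bits = sum((class_bits[c] + crc_bits[c] + TAIL_BITS) * INVERSE_RATE for c in class_bits)
after_puncturing = 688
punctured = code_bits - after_puncturing

print(data_bits, code_bits, punctured, after_puncturing / FRAME_MS, "kbps")
```

Puncturing therefore removes 140 of the 828 code bits per frame, which raises the effective code rate above 1/3.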
DVB-T: concatenation of block codes and convolutional codes
(concatenated = cascaded, i.e. connected in series)
Thank You
Mudrik Alaydrus
Dian Widi Astuti