TE EXTC Data Compression & Cryptography Sem - V
Module - 2
Video and Audio Compression
📽️2.1 Video Compression
🔹 What is Video Compression?
Video compression reduces the size of video files by removing spatial and temporal redundancies. The goal
is to store/transmit video efficiently with minimal loss in visual quality.
Why is video compression needed?
Storage:
Uncompressed video files are extremely large, requiring vast amounts of storage
space. Compression significantly reduces the storage space needed.
Transmission:
Transferring large, uncompressed video files over the internet or other networks is slow and
inefficient. Compression enables faster downloads and streaming.
Playback:
Compressed video needs far less bandwidth and storage throughput to deliver, which gives smoother
playback on slow connections or constrained devices, although the player must spend some processing
power on decoding.
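To get a feel for the storage problem, the size of uncompressed video can be estimated with simple arithmetic. The sketch below is only illustrative; the resolution, bit depth, and frame rate are assumed example values, not figures from these notes.

```python
def raw_video_bytes(width, height, bits_per_pixel, fps, seconds):
    """Size in bytes of uncompressed video with the given parameters."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps * seconds

# Assumed example: Full HD, 24-bit colour, 30 frames/s, one minute of video.
size = raw_video_bytes(1920, 1080, 24, 30, 60)
print(f"{size / 1e9:.2f} GB per minute")  # 11.20 GB per minute
```

At more than 11 GB per minute, even a short clip would be impractical to store or stream without compression.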
🔹 Motion Compensation
Consecutive frames in a video are usually very similar (temporal redundancy). If there is little or no
movement between frames, the encoder can store only the changes between frames, rather than the
entire frame data for each frame.
Motion compensation is the technique that exploits this temporal redundancy.
It predicts the current frame using previous/future frames and only encodes the difference
(residual).
Macroblocks (e.g., 16×16 pixels) are compared across frames.
Motion vectors specify how blocks move from one frame to another.
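The block comparison described above can be sketched as an exhaustive block-matching search. This is a minimal illustration, assuming frames stored as lists of rows and a sum-of-absolute-differences (SAD) cost; the function names are hypothetical, and real encoders use much faster search strategies and sub-pixel accuracy.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(cur, ref, bx, by, n, search):
    """Try every displacement within +/- search pixels; keep the lowest SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost

def frame_with_square(x0, y0, size=8, value=200):
    """Black frame containing one bright 2 x 2 square (toy test data)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(y0, y0 + 2):
        for x in range(x0, x0 + 2):
            frame[y][x] = value
    return frame

ref = frame_with_square(2, 2)   # previous frame
cur = frame_with_square(4, 3)   # square moved 2 right, 1 down
mv, cost = best_motion_vector(cur, ref, bx=4, by=3, n=2, search=2)
print(mv, cost)  # (-2, -1) 0
```

The motion vector points from the current block back to its best match in the reference frame. Here the residual is zero, so only the vector would need to be coded.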
🔹 Spatial Prediction
A single frame often repeats information (spatial redundancy). For example, if a large area of the frame
is a single color, the encoder can represent that area with a smaller amount of data.
Spatial prediction exploits this intra-frame redundancy (within a single frame).
Uses neighboring pixel values in the same frame to predict current pixel values.
Reduces the amount of data to be encoded per frame.
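A very simplified picture of spatial prediction is to predict each pixel from its left neighbour and keep only the prediction error. The sketch below assumes this one-dimensional scheme purely for illustration; actual codecs such as H.264 predict whole blocks from neighbouring blocks using several directional modes.

```python
def left_predict_residuals(row):
    """Predict each pixel from its left neighbour and return the residuals.
    The first pixel has no neighbour, so it is kept unchanged."""
    residuals = [row[0]]
    for i in range(1, len(row)):
        residuals.append(row[i] - row[i - 1])
    return residuals

def reconstruct_row(residuals):
    """Invert the prediction: add each residual to the previous decoded pixel."""
    row = [residuals[0]]
    for r in residuals[1:]:
        row.append(row[-1] + r)
    return row

row = [120, 120, 121, 121, 122, 180, 180, 180]
res = left_predict_residuals(row)
print(res)  # [120, 0, 1, 0, 1, 58, 0, 0]
```

Most residuals are zero or close to it, so they can be coded with far fewer bits than the original pixel values, and `reconstruct_row(res)` recovers the row exactly.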
🔹 Temporal Prediction
By Gauri Joshi VPM’s MPCOE, Velneshwar Page 1
Predicts the current frame from other frames that lie earlier or later in display order. Three frame types are used:
o I-frames: Intra-coded (keyframes, no prediction).
o P-frames: Predict from previous frames.
o B-frames: Bidirectional prediction from both previous and next frames.
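Because a B-frame depends on a later frame, the encoder cannot transmit frames in display order: the future reference has to be sent first. The following sketch shows one way to derive such a coding order; the frame pattern and the function name are assumptions chosen for illustration.

```python
def coding_order(display_types):
    """Map display order to a coding order in which every B-frame follows
    both of its reference frames. Returns display-order indices."""
    order = []
    pending_b = []
    for idx, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(idx)      # wait for the next I- or P-frame
        else:
            order.append(idx)          # send the reference frame first
            order.extend(pending_b)    # then the B-frames that needed it
            pending_b.clear()
    # Simplification: trailing B-frames are emitted last; a real stream would
    # take their future reference from the next group of pictures.
    order.extend(pending_b)
    return order

gop = ["I", "B", "B", "P", "B", "B", "P"]
order = coding_order(gop)
print(order)                     # [0, 3, 1, 2, 6, 4, 5]
print([gop[i] for i in order])   # ['I', 'P', 'B', 'B', 'P', 'B', 'B']
```

The decoder reverses this reordering before display, which is why B-frames add some latency and buffering.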
🔹 MPEG-4
A family of standards from MPEG; its original video codec is MPEG-4 Part 2 (Visual).
Uses DCT (Discrete Cosine Transform), motion compensation, and entropy coding.
Supports object-based compression: encodes individual objects within a scene.
Widely used in internet streaming, video conferencing, and mobile applications.
🔹 H.264 (MPEG-4 Part 10 or AVC – Advanced Video Coding)
🔸 H.264 Encoder:
Steps:
1. Prediction (Intra/Inter)
2. Transform (uses integer approximation of DCT)
3. Quantization
4. Entropy Coding (CAVLC or CABAC)
5. Output of the entropy-coded motion vectors and quantized residuals as the bitstream
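Steps 2 and 3 can be illustrated with the 4×4 integer core transform used by H.264, followed by a quantizer. The transform matrix below is the standard one, but the quantizer is a deliberately simplified uniform quantizer assumed for illustration; the real codec folds the transform scaling into QP-dependent tables.

```python
# 4x4 integer approximation of the DCT used as the H.264 core transform.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_transform(block):
    """Y = CF * X * CF^T, computed with integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))

def quantize(coeffs, step):
    """Simplified uniform quantizer (truncates toward zero); not the
    QP-based scheme of the standard."""
    return [[int(c / step) for c in row] for row in coeffs]

flat = [[10] * 4 for _ in range(4)]            # residual block, all equal
gradient = [[10, 12, 14, 16] for _ in range(4)]  # horizontal ramp

print(forward_transform(flat)[0])                    # [160, 0, 0, 0]
print(forward_transform(gradient)[0])                # [208, -56, 0, -8]
print(quantize(forward_transform(gradient), 16)[0])  # [13, -3, 0, 0]
```

Smooth blocks concentrate their energy in a few low-frequency coefficients, and quantization zeroes the small ones, which is what makes the subsequent entropy coding effective.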
🔸 H.264 Decoder:
Steps:
1. Entropy Decoding
2. Inverse Quantization & Transform
3. Motion Compensation / Intra Prediction
4. Reconstruct Frame
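The final decoder step amounts to adding the decoded residual to the prediction and clipping the result to the valid sample range. A minimal sketch, assuming 8-bit samples and blocks stored as lists of rows (the function name is hypothetical):

```python
def reconstruct_block(prediction, residual):
    """Add the decoded residual to the predicted block and clip each
    sample to the 8-bit range 0..255."""
    return [
        [max(0, min(255, p + r)) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, residual)
    ]

prediction = [[100, 100], [250, 5]]   # from motion compensation or intra prediction
residual = [[3, -2], [10, -9]]        # from inverse quantization and transform
block = reconstruct_block(prediction, residual)
print(block)  # [[103, 98], [255, 0]]
```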
Benefits:
o Higher compression efficiency than the earlier MPEG-4 Part 2 (Visual) codec
o Supports HD and Full HD resolutions
o Widely used in Blu-ray, YouTube, video conferencing
🎧 2.2 Sound & Audio Compression
🔹 Sound & Digital Audio Basics:
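Sound: A mechanical pressure wave, captured by a microphone and digitized by sampling and quantization.
Digital Audio: Represented as sequences of samples (PCM – Pulse Code Modulation).
Sample rate (e.g., 44.1 kHz) sets the highest frequency that can be represented (half the sample rate);
bit depth (e.g., 16-bit) sets the quantization resolution. Together they define quality.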
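The data rate of uncompressed PCM follows directly from these parameters. The sketch below uses the example values from the text and assumes two channels (stereo), which the notes do not specify.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels

# 44.1 kHz, 16-bit, stereo (the channel count is an assumption).
cd_rate = pcm_bit_rate(44_100, 16, 2)
bytes_per_minute = cd_rate * 60 // 8
print(cd_rate)           # 1411200
print(bytes_per_minute)  # 10584000
```

Roughly 1.4 Mbit/s, or about 10.6 MB per minute, is the baseline that audio codecs such as AAC reduce by an order of magnitude.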
🔹 μ-Law and A-Law Companding
Purpose:
Companding = Compressing + Expanding
Used in telephony to optimize dynamic range of voice signals.
μ-Law (used in North America & Japan)
Non-linear companding, more resolution for soft sounds.
Formula:
$$F(x) = \text{sgn}(x) \cdot \frac{\ln(1 + \mu |x|)}{\ln(1 + \mu)}$$
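The formula above can be tried out directly. This is a minimal sketch for samples normalized to the range -1 to 1, assuming μ = 255 (the value commonly used in telephony, not stated in these notes); the expander is obtained by solving the formula for x.

```python
import math

MU = 255  # assumed value; common in North American and Japanese telephony

def mu_law_compress(x, mu=MU):
    """F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), for -1 <= x <= 1."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: |x| = ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_law_compress(x)
    print(f"x = {x:<5} compressed = {y:.3f} expanded = {mu_law_expand(y):.3f}")
```

A quiet sample such as x = 0.01 is mapped to a compressed value above 0.2, so soft sounds occupy a much larger share of the quantizer's range than they would with linear coding.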
A-Law (used in Europe)
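Similar to μ-Law but with a slightly different curve: linear for small amplitudes, logarithmic above them.
Has a slightly smaller dynamic range than μ-Law but less proportional distortion for small signals.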
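For comparison with μ-Law, the A-Law compression curve can be sketched in the same way. The notes give no A-Law formula, so the piecewise definition and the parameter A = 87.6 below are the commonly used ones, stated here as assumptions.

```python
import math

A = 87.6  # assumed value; the usual choice in European telephony

def a_law_compress(x, a=A):
    """A-law compression for -1 <= x <= 1:
    linear segment for |x| < 1/a, logarithmic segment otherwise."""
    ax = abs(x)
    if ax < 1 / a:
        y = a * ax / (1 + math.log(a))
    else:
        y = (1 + math.log(a * ax)) / (1 + math.log(a))
    return math.copysign(y, x)

for x in (0.001, 0.01, 0.1, 1.0):
    print(f"x = {x:<6} compressed = {a_law_compress(x):.3f}")
```

The two segments meet at |x| = 1/A, so the curve is continuous, and full-scale input maps to full-scale output.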
🔹 MPEG-4 Audio Layer
MPEG-4 Part 3 (Audio): the part of the MPEG-4 standard that covers several kinds of audio coding.
Supports:
o Speech coding (CELP, HVXC)
o General audio coding (AAC)
o Structured audio (Synthesis and MIDI-like)
🔹 Advanced Audio Coding (AAC)
Successor to MP3; first standardized in MPEG-2 Part 7 and extended as part of MPEG-4 Audio.
Features:
o Perceptual audio coding
o Frequency domain compression
o Better sound quality than MP3 at the same bitrate
o Used in YouTube, iTunes, DAB+
AAC Compression Process:
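1. Psychoacoustic Model: Estimates masking thresholds, i.e., which parts of the signal are inaudible
2. Filter Bank: Transforms blocks of audio samples into the frequency domain (AAC uses the MDCT)
3. Quantization and Coding: Quantizes the frequency coefficients, more coarsely where the model predicts
the noise will be masked, and entropy codes them
4. Bitstream Formatting: Packs all information into a stream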
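The filter bank and quantization steps can be illustrated with a direct (slow) implementation of the MDCT, the transform AAC uses, followed by a uniform quantizer. This is only a sketch: no window function is applied, the frame is tiny, and the single step size is an assumption standing in for the per-band step sizes that a psychoacoustic model would choose.

```python
import math

def mdct(frame):
    """Direct MDCT: 2N time samples in, N frequency coefficients out.
    X[k] = sum_n x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))"""
    n = len(frame) // 2
    return [
        sum(frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def quantize(coeffs, step):
    """Uniform quantizer; a larger step gives fewer bits and more noise."""
    return [round(c / step) for c in coeffs]

N = 8
K0 = 3  # test tone placed exactly on MDCT bin 3 (illustrative choice)
tone = [math.cos(math.pi / N * (t + 0.5 + N / 2) * (K0 + 0.5))
        for t in range(2 * N)]

coeffs = mdct(tone)
q = quantize(coeffs, step=1.0)
print(q)  # [0, 0, 0, 8, 0, 0, 0, 0]
```

All of the tone's energy lands in one coefficient, so after quantization only a single nonzero value has to be coded. Real audio is less tidy, but the same concentration of energy is what makes frequency-domain coding efficient.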
✅ Summary Table:
Topic               | Key Technique                              | Used In
--------------------|--------------------------------------------|-----------------------
Motion Compensation | Predicts motion between frames             | Video codecs (H.264)
Temporal Prediction | Uses I, P, B frames                        | MPEG, H.264
MPEG-4              | Object-based compression                   | Streaming, mobile
H.264               | Advanced compression with CABAC/CAVLC      | HD Video, Blu-ray
μ-Law / A-Law       | Companding of voice signals                | Telephony
MPEG-4 Audio        | Encodes voice/music                        | AAC, structured audio
AAC                 | High-quality audio compression             | YouTube, iTunes