Ch. 1: Audio/Image/Video Fundamentals
Multimedia Systems

Prof. Thinh Nguyen (based on Prof. Ben Lee's slides)
Oregon State University
School of Electrical Engineering and Computer Science
Outline
Computer Representation of Audio
Quantization
Sampling
Digital Image Representation
Color System
Chrominance Subsampling
Digital Video Representation
Hardware Requirements

Chapter 1: Audio/Image/Video
Fundamentals 2
Computer Representation of
Audio
Sound is created by the vibration of matter (e.g., air molecules).
Sound is a continuous wave that travels through air:
Amplitude is the measure of the displacement of the air pressure wave from its mean (quiescent) state, measured in decibels (dB).
Frequency is the number of periods per second, measured in hertz (Hz, cycles/second). The period is the reciprocal of the frequency.
(Figure: a sound pressure wave over time, annotated with its period and amplitude, i.e., air pressure.)

Computer Representation of
Audio
A transducer (inside a microphone) converts pressure to voltage levels.
The analog signal is converted into a digital stream by discrete sampling: discretization both in time and in amplitude (quantization).
The computer measures the amplitude of the waveform at regular time intervals to produce a series of numbers (samples).

Quantization and Sampling

(Figure: a waveform sampled at a fixed sampling rate; each sample's height is quantized to one of a set of discrete levels, e.g., 0.25, 0.5, 0.75, 1.00.)

Sampling Rate
There is a direct relationship between sampling rate, sound quality (fidelity), and storage space.
How often do you need to sample a signal to avoid losing information?
To choose a sampling rate, be aware of the difference between the playback rate and the capturing (sampling) rate.
It depends on how fast the signal is changing: in practice, twice per cycle of the highest frequency (this follows from the Nyquist sampling theorem).
Human hearing frequency range: 20 Hz - 20 kHz; voice is about 500 Hz to 2 kHz.
Nyquist Sampling Theorem
If a signal f(t) is sampled at regular intervals
of time and at a rate higher than twice the
highest significant signal frequency, then the
samples contain all the information of the
original signal.
Example
The highest significant frequency in CD-quality audio is 22,050 Hz.
By the Nyquist theorem, we need to sample at twice this frequency, so the sampling frequency is 44,100 Hz.
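The aliasing the theorem guards against is easy to demonstrate numerically. The sketch below (Python, added here for illustration; it is not part of the original slides) samples a 3 Hz tone at only 4 Hz, below its 6 Hz Nyquist rate, and shows that the resulting samples are identical to those of a phase-inverted 1 Hz tone, so the original frequency is unrecoverable:

```python
import math

fs = 4.0          # sampling rate (Hz), deliberately below Nyquist for a 3 Hz tone
f_signal = 3.0    # tone frequency (Hz); Nyquist would require fs > 6 Hz
f_alias = f_signal - fs   # 3 Hz folds down to -1 Hz (heard as a 1 Hz tone)

for n in range(8):
    t = n / fs
    s_true = math.sin(2 * math.pi * f_signal * t)
    s_alias = math.sin(2 * math.pi * f_alias * t)
    # The samples are indistinguishable: the 3 Hz tone masquerades as 1 Hz.
    assert abs(s_true - s_alias) < 1e-9
print("3 Hz sampled at 4 Hz is indistinguishable from a 1 Hz tone")
```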
Quantization
Sample precision: the resolution of a sample value.
Quantization depends on the number of bits used to measure the height of the waveform.
16-bit CD-quality quantization yields 64K (65,536) values.
Audio formats are described by sample rate and quantization:
Voice quality: 8-bit quantization, 8,000 Hz, mono (64 Kbps)
CD quality: 16-bit quantization, 44,100 Hz, linear stereo (705.6 Kbps for mono, 1.411 Mbps for stereo)
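These bit rates follow directly from sample rate × precision × channels. A minimal Python check (illustrative; the helper name is my own):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

voice = pcm_bit_rate(8_000, 8, 1)        # telephone-quality speech
cd_mono = pcm_bit_rate(44_100, 16, 1)
cd_stereo = pcm_bit_rate(44_100, 16, 2)

print(voice)      # 64000   -> 64 Kbps
print(cd_mono)    # 705600  -> 705.6 Kbps
print(cd_stereo)  # 1411200 -> 1.411 Mbps
```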

Signal-to-Noise Ratio
A measure of the quality of the signal. Let P_signal and P_noise be the signal power and noise power (variances), respectively:
SNR = 10 log10 (P_signal / P_noise)
Assuming the quantization error is uniform, and the variance of the signal is not too large compared to the maximum signal value V_max, each bit adds about 6 dB of resolution! (See the accompanying derivations for details.)

(Figure: an N-bit quantizer applied to V_signal(t) at a fixed sampling rate. The range ±V_max is divided into 2^N levels, indexed -2^(N-1) to +2^(N-1), so the quantization step is 2·V_max/2^N = V_max/2^(N-1), and the maximum quantization noise V_noise is half a step.)
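The 6 dB-per-bit rule can be verified empirically. The following Python sketch (an illustration, not the accompanying derivation the slide refers to) quantizes a full-scale sine wave with a uniform N-bit quantizer and measures the SNR; each additional bit adds roughly 6 dB:

```python
import math

def quantization_snr_db(n_bits, num_samples=20_000):
    """Quantize a full-scale sine wave with a uniform N-bit quantizer
    and measure the resulting signal-to-noise ratio in dB."""
    levels = 2 ** n_bits
    step = 2.0 / levels                  # signal range is [-1, 1]
    sig_power = 0.0
    noise_power = 0.0
    for k in range(num_samples):
        x = math.sin(2 * math.pi * 7 * k / num_samples)   # 7 full cycles
        q = (math.floor(x / step) + 0.5) * step           # uniform quantization
        sig_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(sig_power / noise_power)

# Each extra bit buys roughly 6 dB (theory: SNR ~ 6.02*N + 1.76 dB for a sine):
for n in (8, 12, 16):
    print(n, "bits ->", round(quantization_snr_db(n), 1), "dB")
```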

Pulse Code Modulation (PCM)
The two-step process of sampling and quantization is known as Pulse Code Modulation.
Based on the Nyquist sampling theorem.
Used in speech and CD encoding.

How Are Audio Samples
Represented?
Audio samples are represented as formats
characterized by four parameters:
Sample rate: Sampling frequency
Precision: Number of bits used to store audio samples
Encoding: Audio data representation (compression)
Channel: Multiple channels of audio may be
interleaved at sample boundaries.
PCM-encoded speech (64 Kbps) and music (1.411 Mbps) strain the bandwidth of the Internet, so some form of compression is needed!
See Chapter 5: Audio Compression

Preview of Chapter 5
Audio samples are encoded (compressed) based on:
Non-uniform quantization (humans are more sensitive to changes in quiet sounds than in loud sounds):
μ-law encoding
Difference encoding
Psychoacoustic principles (humans do not hear all frequencies the same way, due to auditory masking):
Simultaneous masking
Temporal masking
This information is used in MPEG-1 Layer 3, known as MP3.
Reduces the bit rate for CD-quality music down to 128 or 112 Kbps.
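As an illustration of non-uniform quantization, the standard μ-law companding curve (with μ = 255, the value used in telephony) can be sketched in a few lines of Python; the function names here are my own, not from the slides:

```python
import math

MU = 255.0  # μ = 255, as used in North American/Japanese telephony

def mu_law_compress(x):
    """Map a sample in [-1, 1] through the μ-law companding curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet sounds get far more of the output range than loud ones:
print(round(mu_law_compress(0.01), 3))   # ≈ 0.228 -- a 1% input uses ~23% of the range
print(round(mu_law_compress(0.5), 3))    # ≈ 0.876
assert abs(mu_law_expand(mu_law_compress(0.3)) - 0.3) < 1e-12
```

Quantizing the companded value uniformly then amounts to fine steps for quiet sounds and coarse steps for loud ones.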

Outline
Computer Representation of Audio
Quantization
Sampling
Digital Image Representation
Color System
Chrominance Subsampling
Digital Video Representation
Hardware Requirements



Digital Image Representation
An image is an n × m array of picture elements, or pixels.
Pixel representation can be bi-level, gray-scale, or color.
Resolution specifies the distance between sample points (the accuracy of the representation).
(Figure: intensity/brightness level per pixel. Bi-level: 1 bit, black or white. Gray-scale: n bits. Color: 3 × n bits, one n-bit value per R, G, B component.)

Pixels
Images are made up of dots called pixels (picture elements).
The number of pixels affects the resolution of the monitor.
The higher the resolution, the better the image quality.

Color Depth (Pixel Depth)
The amount of information per pixel is known as
the color depth
Monochrome (1 bit per pixel)
Gray-scale (8 bits per pixel)
Color (8 or 16 bits per pixel)
8-bit index into a color palette
16-bit: 5 bits for each of R, G, B, plus 1 alpha bit
True color (24 or 32 bits per pixel)
RGB (24 bits)
RGB + Alpha (32 bits)

Example Color Depth

(Figure: the same image rendered at 1-bit, 4-bit, 8-bit, and 16-bit color depth.)

Color Spaces
A method by which we can specify, create,
and visualize color.
Why more than one color space? Different
color spaces are better for different
applications.
Humans => Hue Saturation Lightness or
Brightness (HSL or HSB)
CRT monitors => Red Green Blue (RGB)
Printers => Cyan Magenta Yellow Black (CMYK)
Compression => Luminance and Chrominance
(YIQ,YUV, YCbCr)
Visible Spectrum

(Figure: the visible spectrum. The human retina is most sensitive to wavelengths around 440 nm, 545 nm, and 580 nm.)

Color Perception
(Figure: luminosity sensitivity of the blue, green, and red receptors versus wavelength, 400-700 nm.)

HSB
H (Hue): defines the color itself; the dominant wavelength.
S (Saturation): indicates the degree to which the hue differs from a neutral gray with the same value (brightness); purity, i.e., % white.
B (Brightness): indicates the level of illumination; luminance, the intensity of light.
RGB Color System
RGB (Red-Green-Blue) is the most widely used color system.
It represents each pixel as a color triplet of the form (R, G, B); e.g., for 24-bit color, each numerical value is 8 bits (varies from 0 to 255):
(0, 0, 0) = black
(255, 255, 255) = white
(255, 0, 0) = red
(0, 255, 255) = cyan
(65, 65, 65) = a shade of gray
RGB
RGB is an additive model: no beam, no light.
(Figure: overlapping red, green, and blue beams. Red + green = yellow, red + blue = magenta, green + blue = cyan; all 3 beams => white!)


CMYK Color System
For printing, there is no light source; we see light reflected from the surface of the paper.
CMYK is a subtractive color model.
(Figure: overlapping cyan, magenta, and yellow inks. No ink => 100% reflection of light => white! All 3 colors => black!)
But, due to imperfect ink, the result is usually a muddy brown. That's why black (K) ink is added.
YUV Color System
PAL (Phase Alternating Line) standard.
Humans are more sensitive to luminance (brightness) fidelity than to color fidelity.
Luminance (Y): encodes the brightness or intensity.
Chrominance (U and V): encodes the color information.
YUV uses 1 byte for the luminance component and 4 bits for each chrominance component.
This requires only 2/3 of the space (vs. RGB = 24 bits), so better compression! This coding ratio is called 4:2:2 subsampling.
RGB <=> YUV:
Y = 0.3R + 0.59G + 0.11B
U = (B - Y) × 0.493
V = (R - Y) × 0.877
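Using the coefficients above, the RGB-to-YUV conversion is a few lines of Python (an illustrative sketch; the function name is mine). Note that any gray input yields zero chrominance, which is why U and V carry only color information and can be subsampled cheaply:

```python
def rgb_to_yuv(r, g, b):
    """Convert an RGB triplet (0-255 each) with the coefficients above."""
    y = 0.3 * r + 0.59 * g + 0.11 * b   # luminance
    u = (b - y) * 0.493                 # blue chrominance
    v = (r - y) * 0.877                 # red chrominance
    return y, u, v

# Any gray has zero chrominance:
y, u, v = rgb_to_yuv(128, 128, 128)
assert abs(u) < 1e-9 and abs(v) < 1e-9
print(rgb_to_yuv(255, 0, 0))   # pure red: large positive V, negative U
```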
YCbCr Color System
Closely related to YUV; it is a scaled and shifted YUV.
Cb (blue) and Cr (red) chrominance.
Used in JPEG and MPEG.
RGB <=> YCbCr:
Y = 0.257R + 0.504G + 0.098B + 16
Cb = ((B - Y)/2) + 0.5 = -0.148R - 0.291G + 0.439B + 128
Cr = ((R - Y)/1.6) + 0.5 = 0.439R - 0.368G - 0.071B + 128

YIQ Color System
Used in NTSC color TV broadcasting; B/W TV results if only Y is used.
YIQ signal (similar to YUV):
Y = 0.299R + 0.587G + 0.114B
I = 0.596R - 0.275G - 0.321B
Q = 0.212R - 0.528G + 0.311B
Composite signal: all information is composed into one signal.
To decode, modulation methods are needed to eliminate interference between the luminance and chrominance components.

Color Decomposition
(Figure: the same image decomposed into channels per color system: RGB into Red/Green/Blue; CMYK into Cyan/Magenta/Yellow; YUV into Y/U/V; YIQ into Y/I/Q.)

Chrominance Subsampling
What's another way to cut chrominance bandwidth in half? Use 4 bits per pixel.
The human eye is less sensitive to variations in color than in brightness.
Compression is achieved with little loss in perceptual quality.
Subsampling ratios are written Y:Cb:Cr (e.g., 4:2:2):
First digit: the horizontal sampling reference (luminance factor).
Second digit: the Cb horizontal factor, relative to the first digit.
Third digit: the Cr horizontal factor, relative to the first digit (except when 0, which means Cb and Cr are subsampled both horizontally and vertically).
4:2:2 Subsampling
For every 4 luminance samples, take 2 chrominance samples (subsampling by 2:1, horizontally only).
The chrominance planes are just as tall but half as wide.
Reduces bandwidth by 1/3.
Used in professional editing (high-end digital video formats).

4:1:1 Subsampling
For every 4 luminance samples, take 1 chrominance sample (subsampling by 4:1, horizontally only).
Used in digital video.

4:2:0 Subsampling
For every 4 luminance samples, take 1 chrominance sample (subsampling by 2:1 both horizontally and vertically).
Chrominance is halved in both directions.
Most commonly used.
Several varieties exist, differing in where the chrominance samples are positioned: JPEG, MPEG-1, and MJPEG use one convention; MPEG-2 another.
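A minimal Python sketch of 4:2:0 chrominance subsampling (illustrative only; it averages each 2×2 block of a chrominance plane, which is one of several possible sample-positioning conventions):

```python
def subsample_420(plane):
    """Average each 2x2 block of a chrominance plane (4:2:0 subsampling).
    `plane` is a list of equal-length rows with even dimensions."""
    out = []
    for i in range(0, len(plane), 2):
        row = []
        for j in range(0, len(plane[0]), 2):
            block_sum = (plane[i][j] + plane[i][j + 1] +
                         plane[i + 1][j] + plane[i + 1][j + 1])
            row.append(block_sum // 4)   # one chrominance sample per 4 pixels
        out.append(row)
    return out

cb = [[100, 104, 200, 204],
      [102, 106, 202, 206],
      [ 50,  54,  10,  14],
      [ 52,  56,  12,  16]]
print(subsample_420(cb))   # [[103, 203], [53, 13]]
```

The luminance plane is left untouched; only Cb and Cr shrink to a quarter of their original size.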

How Are Images Represented?
A single digitized image of 1024 × 1024 pixels at 24 bits per pixel requires ~25 Mbits of storage:
~7 minutes to send over a 64 Kbps modem!
~8-25 seconds to send over a 1-3 Mbps cable modem!
Some form of compression is needed!
See Chapter 2: Compression Basics and Chapter 3: Image Compression
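The arithmetic behind those figures, as a quick Python check:

```python
bits = 1024 * 1024 * 24                 # one true-color 1024 x 1024 image
print(bits)                             # 25165824 bits (~25 Mbits)
print(round(bits / 64_000 / 60, 1))     # 6.6 minutes over a 64 Kbps modem
print(round(bits / 1_000_000, 1))       # 25.2 seconds over a 1 Mbps cable modem
print(round(bits / 3_000_000, 1))       # 8.4 seconds over a 3 Mbps cable modem
```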

Preview of Chapters 2 and 3
Lossless: no information is lost.
Exploits redundancy.
The most probable data is encoded with fewer bits.
Lossy: an approximation of the original image.
Looks at how pixel values change.
The human eye is more sensitive to luminance than to chrominance.
The human eye is less sensitive to subtle features of the image.
JPEG uses both techniques.
Outline
Computer Representation of Audio
Quantization
Sampling
Digital Image Representation
Color System
Chrominance Subsampling
Digital Video Representation
Hardware Requirements

Digital Video Representation
Video can be thought of as a sequence of moving images (or frames).
Important parameters in video:
Digital image resolution (e.g., n × m pixels)
Quantization (e.g., k bits per pixel)
Frame rate (p frames per second, i.e., fps)
Continuity of motion:
is achieved at a minimum of 15 fps
is good at 30 fps
HDTV recommends 60 fps!
Standard Video Data Formats
National Television System Committee (NTSC):
Set the standard for transmission of analog color pictures back in 1953!
Used in the US and Japan.
525 lines (480 visible).
Resolution? Not digital, but equivalent to the quality produced by 720 × 486 pixels.
30 fps (i.e., delay between frames = 33.3 ms).
Video aspect ratio of 4:3 (e.g., 12 in. wide, 9 in. high).
Other standards:
PAL (Phase Alternating Line): used in parts of Western Europe.
SECAM: the French standard.

HDTV
Advanced Television Systems Committee (ATSC):
> 1000 lines
60 fps
Resolutions of 1920 × 1080 and 1280 × 720 pixels
Video aspect ratio of 16:9
MPEG-2 for video compression
AC-3 (Audio Coding-3) for audio compression
5.1-channel Dolby surround sound
Bandwidth Requirements
NTSC: 720 × 486 pixels, 30 fps, true color:
3 × 720 × 486 × 8 × 30 = 251,942,400 bps, or ~252 Mbps!
With 4:2:2 subsampling:
Luminance part: 720 × 486 × 8 × 30 = 83,980,800 bps
Chrominance part: 2 × (720/2) × 486 × 8 × 30 = 83,980,800 bps
Together: ~168 Mbps!
For uncompressed HDTV-quality video, the bandwidth requirement is:
3 × 1920 × 1080 × 8 × 60 = 2,985,984,000 bps, or ~3 Gbps!
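These bandwidth figures can be reproduced with a few lines of Python (the helper function is my own, added for illustration):

```python
def video_bit_rate(width, height, bits_per_sample, fps, planes=3):
    """Uncompressed video bit rate in bits per second."""
    return planes * width * height * bits_per_sample * fps

ntsc = video_bit_rate(720, 486, 8, 30)   # true color, no subsampling
print(ntsc)                              # 251942400  (~252 Mbps)

# 4:2:2 -- a full-rate luminance plane plus two half-width chrominance planes:
y = video_bit_rate(720, 486, 8, 30, planes=1)
c = 2 * video_bit_rate(720 // 2, 486, 8, 30, planes=1)
print(y + c)                             # 167961600  (~168 Mbps)

hdtv = video_bit_rate(1920, 1080, 8, 60)
print(hdtv)                              # 2985984000 (~3 Gbps)
```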

Video Compression
In addition to the techniques used in JPEG, MPEG exploits:
Spatial redundancy: correlation between neighboring pixels.
Spectral redundancy: correlation between different frequency bands.
Temporal redundancy: correlation between successive frames.
See Chapter 5: Video Compression.
What about delay through the network?
See Chapter 6: Multimedia Networking.

Outline
Computer Representation of Audio
Quantization
Sampling
Digital Image Representation
Color System
Chrominance Subsampling
Digital Video Representation
Hardware Requirements

Hardware Requirements
Multimedia servers
Routers
Multimedia Enhanced PCs
Wireless Mobile devices - Cell Phones,
Pocket PCs, Internet Appliances
See Chapter 7: Multimedia Embedded Systems.

