
U2 Lesson 4 - Text Sound and Images As Digital Data

This document covers how text, sound, and images are represented as digital data. It explains character encoding for text, the process of digitizing sound through sampling, and how images are composed of pixels with associated metadata. Additionally, it discusses the differences between ASCII and Unicode, the RGB color model, and how digital audio formats like MP3 work.

Uploaded by

robert.saunders

Unit 2: Digital Information Name: ________________________

Lesson 4: Text, Sound, and Images as Digital Data

Text as Digital Data


• Text is represented as digital data through a process called _______________________________
• Character encoding assigns a unique numerical value (_________________) to each character in a
character set, allowing computers to store and manipulate text in a digital format.
• A character set is a collection of characters, symbols, and glyphs that can be used to represent written
language. Common character sets include ASCII (American Standard Code for Information Interchange)
and Unicode.
• A code point is a numerical value assigned to a specific character in a character set. For example, the
code point for the letter "A" is 65 in ASCII and Unicode.
• Computers use binary digits to store and process data, so to represent text in digital form, each
character's code point is translated into binary.
• Text files are stored in computers as sequences of binary data and the encoding scheme used determines
how the binary data corresponds to characters.
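As a rough illustration of the bullets above, the Python sketch below looks up a character's code point and writes it as 8 binary digits. The helper name to_binary is made up for this example; ord and format are built-in Python functions.

```python
def to_binary(text):
    """Return the 8-bit binary form of each character's code point."""
    # ord() gives the code point; format(..., "08b") pads it to 8 binary digits.
    # Assumes every character has a code point below 256 (true for ASCII).
    return [format(ord(ch), "08b") for ch in text]

print(ord("A"))          # 65
print(to_binary("CAB"))  # ['01000011', '01000001', '01000010']
```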

The partial ASCII table below outlines a few common symbols and their decimal, hexadecimal, and binary
representation.

Character  Decimal  Hex  Binary      Character  Decimal  Hex  Binary
!          33       21   00100001    A          65       41   01000001
"          34       22   00100010    B          66       42   01000010
#          35       23   00100011    C          67       43   01000011
$          36       24   00100100    D          68       44   01000100
%          37       25   00100101    E          69       45   01000101
&          38       26   00100110    F          70       46   01000110
'          39       27   00100111    G          71       47   01000111
(          40       28   00101000    H          72       48   01001000
)          41       29   00101001    I          73       49   01001001
*          42       2A   00101010    J          74       4A   01001010
+          43       2B   00101011    K          75       4B   01001011
,          44       2C   00101100    L          76       4C   01001100
-          45       2D   00101101    M          77       4D   01001101
.          46       2E   00101110    N          78       4E   01001110
/          47       2F   00101111    O          79       4F   01001111
0          48       30   00110000    P          80       50   01010000
1          49       31   00110001    Q          81       51   01010001
2          50       32   00110010    R          82       52   01010010
3          51       33   00110011    S          83       53   01010011
4          52       34   00110100    T          84       54   01010100
5          53       35   00110101    U          85       55   01010101
6          54       36   00110110    V          86       56   01010110
7          55       37   00110111    W          87       57   01010111
8          56       38   00111000    X          88       58   01011000
9          57       39   00111001    Y          89       59   01011001
:          58       3A   00111010    Z          90       5A   01011010

To store the four-character string "FUN!" in memory, the computer would store the binary representation of
each individual character:

Character               F    U    N    !
Binary Representation

• How does the computer know that 0010 0001 represents a ! and not the integer value 33? ____________
• Computers rely on context to determine how binary data should
be interpreted.
• If you are working with a text document, a word processing
software, or any application that deals with textual data, the
computer will use the appropriate character encoding (such as
ASCII, Unicode, etc.) to interpret the binary data as characters.
• In many programming languages, you declare the data type
when storing a value. For example, in languages such as Java
or C++, writing char c = 0b00100001; assigns a binary value
to a character variable, so the compiler knows to interpret
that value as a character.
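The same idea can be sketched in Python, where the programmer picks the interpretation by choosing which built-in function to apply. The variable names are chosen only for this illustration.

```python
bits = "00100001"

as_number = int(bits, 2)       # read the bits as an integer
as_character = chr(as_number)  # read the same value as a character code point

print(as_number)     # 33
print(as_character)  # !
```

The stored bits never change; only the way the program reads them does.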

ASCII vs Unicode
• Unicode is more commonly used now instead of ASCII because it provides ______________________
that can encompass characters from multiple languages, scripts, and writing systems.
• While ASCII is limited to representing characters in the English language and a few basic symbols,
Unicode includes a comprehensive collection of characters from virtually all the world's languages,
making it more versatile and suitable for global communication and software development.
• Unicode uses a __________ representation instead of the ___________ representation of ASCII.
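One way to see the difference, sketched in Python. The three characters are arbitrary examples, the helper names are made up, and UTF-8 is only one of several ways of storing Unicode code points as bytes.

```python
def utf8_length(ch):
    """Number of bytes needed to store one character in UTF-8."""
    return len(ch.encode("utf-8"))

def fits_in_ascii(ch):
    """True if the character's code point is in the ASCII range (0 to 127)."""
    return ord(ch) < 128

# "\u00e9" is e with an acute accent; "\u20ac" is the euro sign.
for ch in ["A", "\u00e9", "\u20ac"]:
    print(ch, ord(ch), fits_in_ascii(ch), utf8_length(ch))
```

Only "A" falls inside the ASCII range; the other two characters need Unicode and take more than one byte in UTF-8.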

Programs and Systems that Use Unicode:


• Browsers like Chrome, Firefox, Safari, and Edge use Unicode to render web pages in different
languages.
• Most modern operating systems, including Windows, macOS, and Linux, use Unicode for file names,
user interface elements, and text processing.
• Software like Microsoft Word, Google Docs, LibreOffice, and Notepad++ use Unicode to handle
multilingual documents.

In contrast, ASCII is still used in situations where only basic English characters and symbols are needed, such
as older systems, legacy software, and certain programming tasks that don't involve internationalization.
Sound as Digital Data
Most of us have had the experience of playing MP3 sound files, uploading pictures, or streaming the latest Netflix
film. How does this information get represented on a computer, when a computer can only communicate and
store data as 1s and 0s?

• Sound is _______________ data; it has a continuous and natural
form. How do we represent analog data as digital data?
• Sound is a form of energy that travels in waves through the air (or
other materials).
• When you talk, listen to music, or hear any other sound, what you're
actually hearing are these waves vibrating through the air.
• The computer needs to break down this sound into tiny pieces that
it can work with.
• It does this by taking snapshots of the sound's ________________
(how high or low the sound wave is at a specific moment) at regular intervals.

For each snapshot, the computer assigns a binary number based on the amplitude. Let's say the computer uses 8
bits for each number. Here's a simplified example:

• If the amplitude is low, it might be represented as 00000000
• If the amplitude is higher, it might be represented as 01010101
• And if the amplitude is even higher, it might be represented as 11111111

These binary numbers are like a digital version of the sound wave. The computer does this many, many times
per second to capture the entire sound.

• The process of taking those snapshots is called _______________
• The number of snapshots taken per second is called the __________________
• A higher sampling rate means the sound is captured in more detail
• The number of bits used to represent each snapshot is called the ________________
• The higher the bit depth, the more distinct amplitude values each sample can take, so the original
audio signal is recreated more accurately. The most common audio bit depths are 16-bit, 24-bit, and 32-bit.
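The snapshot process described above can be sketched in Python. This is a simplified model, not how any particular audio program works: the sound is assumed to be a pure sine wave, and the function name sample_wave and its parameters are made up for the illustration.

```python
import math

def sample_wave(frequency_hz, sampling_rate, bit_depth, num_samples):
    """Sample a sine wave and store each sample as a bit_depth-bit binary string."""
    levels = 2 ** bit_depth          # how many distinct amplitude values fit
    samples = []
    for n in range(num_samples):
        t = n / sampling_rate        # time of this snapshot, in seconds
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1 and 1
        # Map the range -1..1 onto the whole numbers 0..levels-1.
        level = round((amplitude + 1) / 2 * (levels - 1))
        samples.append(format(level, f"0{bit_depth}b"))
    return samples

# Four snapshots of a 1 Hz wave, taken 4 times per second, 8 bits each.
print(sample_wave(frequency_hz=1, sampling_rate=4, bit_depth=8, num_samples=4))
```

Raising sampling_rate gives more snapshots per second, and raising bit_depth gives each snapshot more possible values.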

When you play the sound back, the computer reads those binary numbers, converts them back to their original
amplitudes, and plays them through speakers or headphones. Because the computer is doing this incredibly
quickly, your ears perceive it as continuous sound.
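CD-quality digital audio samples sound signals at a rate of 44,100 samples/second (44.1 kHz), using 16 bits
per sample. MP3, one of the most widely used digital audio formats, typically keeps this sampling rate but
compresses the data so the file takes up less space.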
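Taking the figures above (44,100 samples per second and 16 bits per sample), a short Python calculation shows how much data one minute of sound needs before any compression. The function name and the channels parameter are assumptions added for this sketch; one channel is assumed here.

```python
def audio_size_bits(sampling_rate, bit_depth, seconds, channels=1):
    """Bits needed to store uncompressed audio."""
    return sampling_rate * bit_depth * seconds * channels

bits = audio_size_bits(44_100, 16, seconds=60)
print(bits)       # 42336000 bits
print(bits // 8)  # 5292000 bytes
```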

Images as Digital Data


• A digital image is made up of tiny dots called
___________ (short for "picture elements")
• Each pixel is a small square or dot of color
that, when combined with other pixels, forms a
complete image
• Think of an image as a grid where each grid
cell represents a pixel
• Screen resolution is the number of pixels on a
device within each dimension (width × height) that can be displayed on the screen. For example, a
device with the resolution of “1024 × 768” has a 1024-pixel width and a 768-pixel height.
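The total number of pixels follows directly from the two dimensions. A minimal Python check, with a function name chosen only for this example:

```python
def pixel_count(width, height):
    """Total pixels in a grid of the given width and height."""
    return width * height

print(pixel_count(1024, 768))  # 786432
```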
The average human eye cannot accurately discern components closer together than about 0.05-0.1 mm, so if the
pixels are ______________ enough, they appear to the human eye as a continuous image. A high-quality digital
camera stores about 10 – 15 million pixels per photograph. This averages out to pixels roughly 0.02 mm apart.

So how do we encode these pixels into binary representation?

Black and White Images


• Let's look at a simple example of a black and white image.
• White is represented by a single 1 (since that means light "on") and black is represented as a single 0
(since that means light "off").

Given the following binary representation, what image does this create? Draw some ideas below.

0111010101110111010101110
Was that easy or difficult to create? What information do you also have to know?

• This is where ___________________ comes in.
• Metadata is data that describes other data.
• For example, a digital image may include metadata that describe the size of the image, number of colors,
or resolution.
• Metadata is usually included at the beginning of the coding scheme to let the computer know how to set
up the photo.

• For example, the first two bytes of an image can be used to represent its width and height (1
byte for each).
• The standard varies and this is where the layer of abstraction comes in - know that the metadata is there,
but you do not have to read it or know exactly what it contains.

Let's take the example we just had and include two bytes containing the dimensions at the beginning: width x
height, followed by the black and white color scheme:

0000 0101 0000 0101 0111010101110111010101110

It is much easier to decode the image when we know the metadata!
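A small Python sketch of this scheme, assuming exactly the layout described above: one byte for width, one byte for height, then one bit per pixel. The function name and the 3 x 3 sample data are invented for the illustration and are not part of the exercise.

```python
def decode_black_and_white(bits):
    """Decode a bit string: 1 byte width, 1 byte height, then 1 bit per pixel."""
    bits = bits.replace(" ", "")   # spaces are only there for readability
    width = int(bits[0:8], 2)      # first byte of metadata
    height = int(bits[8:16], 2)    # second byte of metadata
    pixels = bits[16:16 + width * height]
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    # Draw 1 (white, light on) as "#" and 0 (black, light off) as ".".
    return ["".join("#" if p == "1" else "." for p in row) for row in rows]

for line in decode_black_and_white("0000 0011 0000 0011 010111010"):
    print(line)
# .#.
# ###
# .#.
```

Without the two metadata bytes, the decoder would have no way to know where each row ends.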

Color Images
• Instead of just representing a black and white image, let's kick it up a notch and represent a grayscale
image.
• Instead of using 1 bit per pixel, we can use 3 bits per pixel and have _______ shades of intensity.

Color
Binary Representation    000  001  010  011  100  101  110  111
Decimal Representation   0    1    2    3    4    5    6    7

We can see from here that if we wanted more colors, all we would have to do is increase the number of bits in
each pixel!
Here would be an example of how we can encode that metadata:

The first byte represents the width, the second byte represents the height, and the third byte represents how
many _________________. The rest of the data is coding the grayscale colors.

0000 0101 0000 0101 0000 0011 011 101 010 111 011 101 010 111 … <rest of greyscale code>
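The three-byte header can be read the same way in code. The Python sketch below assumes the layout just described, with the third byte read as the number of bits per pixel. The function name and the 2 x 2 sample image are made up for the example.

```python
def decode_grayscale(bits):
    """Decode: 1 byte width, 1 byte height, 1 byte bits-per-pixel, then pixel data."""
    bits = bits.replace(" ", "")   # spaces are only there for readability
    width = int(bits[0:8], 2)
    height = int(bits[8:16], 2)
    bits_per_pixel = int(bits[16:24], 2)
    data = bits[24:]
    # Cut the pixel data into chunks of bits_per_pixel and convert each to a number.
    values = [int(data[i:i + bits_per_pixel], 2)
              for i in range(0, width * height * bits_per_pixel, bits_per_pixel)]
    # Group the values into rows of the given width.
    return [values[r * width:(r + 1) * width] for r in range(height)]

print(decode_grayscale("0000 0010 0000 0010 0000 0011 000 011 101 111"))
# [[0, 3], [5, 7]]
```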

How do we get more colors?

Option 1
• Map the colors to a decimal representation
• This is what we did in the grayscale example and also what we do with characters in a text (Unicode)

Problem: There are SO MANY colors on the visible spectrum; how can we possibly have a map big enough?

Option 2
• Mix __________ to create colors
• The visible spectrum is made up of different light wavelengths
• Mixing wavelengths is how we get different colors

The way color is represented in a computer is different from the ways we represent text or numbers.
• With text, we just made a list of characters and assigned a number to each one.
• With color, we actually use binary to encode _________________________________ in the RGB
model should get.

The _________ (Red-Green-Blue) encoding scheme describes a specific color by capturing the individual
contribution to a pixel's color of each of the three light colors.

___________________________ are the primary colors of light.


• These are used by television and computer screens, because such
things emit light. By combining red, green and blue light in various
proportions, you can make pretty much any color you can think of.
• This is based on our having three types of cone cells in our eyes,
which are sensitive to different wavelengths of light and together
produce the sensation that we know as color.

____________________________ are the primary colors of pigment.


• These are used in printing because inks and such are essentially subtractive - they absorb light rather
than emitting it.
• Cyan ink absorbs red light, magenta ink absorbs green light, and yellow ink absorbs blue light.
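A very simplified Python model of the three bullets above: white light is treated as red, green, and blue all switched on (1), and an ink removes whichever light it absorbs. The names and the on/off simplification are assumptions for this sketch; real inks absorb light less cleanly.

```python
WHITE_LIGHT = (1, 1, 1)  # (red, green, blue): 1 = light present, 0 = absent

def reflected_light(absorbed):
    """Return the (red, green, blue) light left after an ink absorbs some of it."""
    return tuple(w - a for w, a in zip(WHITE_LIGHT, absorbed))

print(reflected_light((1, 0, 0)))  # cyan ink absorbs red      -> (0, 1, 1)
print(reflected_light((0, 1, 0)))  # magenta ink absorbs green -> (1, 0, 1)
print(reflected_light((0, 0, 1)))  # yellow ink absorbs blue   -> (1, 1, 0)
```
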
_______________________________ are not actually primary colors, although they are often taught as such in
elementary school. The reason they are taught is a combination of:
• it is a scheme that has been used by artists since the late 16th or early 17th century (long before RGB or
CMY came along), as a small set of pigments from which they can make a wide variety of other colors
rather than having to fork out for a wider range of them, and the tradition has continued
• it is an approximation of CMY that is easier for children to understand

Each pixel contains three color codes: Red, Green, and Blue.
Each of the colors gets a number that describes how much that color will contribute to the overall color of the pixel.
• There is one byte for each color, which will range in decimal from 0 to 255.
• ______ means there is no contribution from this color
• ______ means a full contribution of the color
• Each pixel therefore has 3 bytes of data
• Hexadecimal is a more compact way to write these values, so you will often see a 6-digit hex code
in which each pair of digits gives one of the 3 color contributions

Color     Decimal              Binary                     Hexadecimal Code
          (red, green, blue)                              (each color contribution is 2 symbols)
White     255, 255, 255        111111111111111111111111   #FFFFFF
Gray      153, 153, 153        100110011001100110011001   #999999
Black     0, 0, 0              000000000000000000000000   #000000
Red       255, 0, 0            111111110000000000000000   #FF0000
Green     0, 255, 0            000000001111111100000000   #00FF00
Blue      0, 0, 255            000000000000000011111111   #0000FF
Yellow    255, 255, 0          111111111111111100000000   #FFFF00
Cyan      0, 255, 255          000000001111111111111111   #00FFFF
Magenta   255, 0, 255          111111110000000011111111   #FF00FF
Orange    255, 153, 0          111111111001100100000000   #FF9900
Pink      255, 170, 170        111111111010101010101010   #FFAAAA
Purple    170, 0, 170          101010100000000010101010   #AA00AA
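The conversions in the table can be reproduced with a few lines of Python. The two function names are made up for this sketch; the formatting codes "02X" (two hex digits) and "08b" (eight binary digits) are standard Python.

```python
def rgb_to_hex(red, green, blue):
    """Write three 0-255 values as a hex color code, 2 symbols per color."""
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)

def rgb_to_binary(red, green, blue):
    """Write three 0-255 values as 24 binary digits, 8 per color."""
    return "".join(format(c, "08b") for c in (red, green, blue))

print(rgb_to_hex(255, 153, 0))     # #FF9900
print(rgb_to_binary(255, 153, 0))  # 111111111001100100000000
```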

• Using 3 bytes of information per pixel (24 bits) allows us to represent 2^24 distinct colors (about _____
million)
• This is standard in a _________ file format, and since it's estimated that the average human can
distinguish between about 1 million to 10 million different colors, this provides a great file format for
digital images.
• Newer high-resolution video standards are increasing that number: HDMI allows pixel bit depths of up
to 48 bits, allowing for 2^48 distinct colors, over 280 trillion!
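The number of distinct colors is 2 raised to the number of bits per pixel. A short Python check of that relationship, with a function name chosen only for this example:

```python
def distinct_colors(bits_per_pixel):
    """How many different values fit in the given number of bits."""
    return 2 ** bits_per_pixel

print(distinct_colors(3))           # 8
print(f"{distinct_colors(48):,}")   # 281,474,976,710,656
```

The same function can be called with 24 to work out the figure for 3 bytes per pixel.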
