0% found this document useful (0 votes)
31 views55 pages

02 RepresentingInfo

The document discusses the fundamental concepts of computer systems organization, focusing on binary representation, data encoding, and the significance of bits in digital computing. It explains various number representations, data sizes, and how integers are encoded in C, including two's complement and unsigned values. Additionally, it covers bit-level operations, boolean algebra, and the differences between logical and bitwise operations.

Uploaded by

templlm1994
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views55 pages

02 RepresentingInfo

The document discusses the fundamental concepts of computer systems organization, focusing on binary representation, data encoding, and the significance of bits in digital computing. It explains various number representations, data sizes, and how integers are encoded in C, including two's complement and unsigned values. Additionally, it covers bit-level operations, boolean algebra, and the differences between logical and bitwise operations.

Uploaded by

templlm1994
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Computer Systems Organization

Topic 2
Based on chapter 2 from Computer Systems
by Randal E. Bryant and David R. O’Hallaron
Everything is bits
• Each bit is 0 or 1 (binary digits)
• Form basis of digital revolution
• Why bits? Electronic Implementation
– Storing/performing computations is simple/reliable
– Easy to store with bistable elements
– Reliably transmitted on noisy and inaccurate wires
0 1 0

1.1V
0.9V

0.2V
0.0V
Everything is bits
• Single bit may not be useful but bit patterns do
(groups of bits)
• 3 important representations of numbers
– Unsigned encodings based on binary notation
– Two’s complement encoding to represent signed
integers
– Floating point encoding are base-2 version for
representing real numbers
• Limited number of bits to encode numbers can
have surprising effects e.g., 200 * 300 * 400 * 500
yields -884901888 (32 bit representation)
Binary representation
• Base 2 Number Representation
– Represent 1521310 as 111011011011012
– Represent 1.2010 as
1.0011001100110011[0011]…2
– Represent 1.5213 X 104 as 1.11011011011012 X
213
How to convert
• 11 = (1011)₂ = 2^3*1 + 2^2*0 + 2^1*1 + 2^0*1
• 11/2 = 5 (1)
• 5/2 = 2 (1)
• 2/2 = 1 (0)
• 1/2 = 0 (1)

• 12 = (1100)₂ = 2^3*1 + 2^2*1 + 2^1*0 + 2^0*0


• 12/2 = 6(0)
• 6/2 = 3(0)
• 3/2 = 1(1)
• 1/2 = 0(1)
How to convert (fraction)
• Convert 0.8125
• .8125 * 2 = 1.6250 (1)
• .6250 * 2 = 1.250 (1)
• .250 * 2 = 0.5 (0)
• .5 * 2 = 1.0 (1)
• 0 * 2 = 0 (0)
• Soln: 0.11010
• Converting back: 1*2^-1 + 1*2^-2 + 0 + 1*2^-3 +
0 = .5 + .25 + 0 + .0625 + 0 = 0.8125
Encoding Byte Values
• Byte = 8 bits 0 0 0000
1 1 0001
– Binary 000000002 to 111111112 2 2 0010
3 3 0011
– Decimal: 010 to 25510 4 4 0100
5 5 0101
– Hexadecimal 0016 to FF16 6 6 0110
7 7 0111
• Base 16 number representation 8 8 1000
9 9 1001
• Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ A 10 1010
B 11 1011
• Write FA1D37B16 in C as C 12 1100
D 13 1101
– 0xFA1D37B E 14 1110
– 0xfa1d37b F 15 1111
Hex Notation
• 314156 = 19634.16 + 12
• 19634 = 1227.16 + 2
• 1227 = 76.16 + 11
• 76 = 4.16 + 12
• 4 = 0.16 + 4
• Hence 0x4CB2C
• What is binary and hex notation for 158 ?
• What is 0x605c + 0x5 ?
• Please practice…
Data sizes
• Each computer has word size indicating nominal size
of pointer data
• Virtual address is encoded by such a word, hence
word size determines size of virtual address space
• For machine with w-bit word size, virtual address can
range from 0 to (2^w)-1, giving program access to at
most 2^w bytes
• 32 bit word size limits virtual address space to 4GB
[4*10^9 bytes] while 64 bit leads to 16 exabytes i.e.,
1.84 * 10^19 bytes
• Most 64-bit machines can run programs compiled to
use for 32-bit machines i.e., backward compatible
• 32 bit vs. 64 bit programs [rather than machine
distinction lies in how the program is compiled]
Typical Data Representations in C
C Data Type Typical 32-bit Typical 64-bit
Most data types encode
signed values unless
char 1 1 prefixed by unsigned
short 2 2
int 4 4 Exception for char – need
to declare signed char
long 4 8
float 4 4
double 8 8
long double − −

pointer 4 8
Addressing and Byte Ordering
• In almost all machines, a multi-byte object stored as
a contiguous sequence of bytes
– E.g., variable x of type int has address 0x100 i.e., address
expression &x is 0x100. Assuming int is 4 bytes, x would
be stored in location 0x100, 0x101, 0x102 and 0x103.
• Two notations – big endian vs. little endian
• Number 0x01234567 stored as {01}{23}{45}{67} in
big endian notation while stored as {67}{45}{23}{01}
in little endian notation for the addresses 0x100,
0x101, 0x102, 0x103.
• Most intel compatible machines are little endian
while most IBM/Oracle machines are big endian
Representing code
• Consider the following C function:
int sum(int x, int y) {
Return x + y;
}
• Following machine code generated when compiled
• Linux 32: 55 89 e5 8b 45 0c 03 45 08 c9 c3
• Windows: 55 89 e5 8b 45 0c 03 45 08 5d c3
• Instruction codings can be different – binary code is
seldom portable across combinations of machines +
OS
• From machine perspective – program is simply a
sequence of bytes and has no/minimal information
of original source program
Boolean Algebra
• Developed by George Boole in 19th Century
– Algebraic representation of logic
• Encode “True” as 1 and “False” as 0
And Or
◼ A&B = 1 when both A=1 and B=1 ◼ A|B = 1 when either A=1 or B=1

Not Exclusive-Or (Xor)


◼ ~A = 1 when A=0 ◼ A^B = 1 when either A=1 or B=1, but not both
General Boolean Algebras
• Operate on Bit Vectors (strings of 0’s and 1’s of
fixed length w) - operations applied bitwise
01101001 01101001 01101001
& 01010101 | 01010101 ^ 01010101 ~ 01010101
01000001
01000001 01111101
01111101 00111100
00111100 10101010
10101010

• All of the Properties of Boolean Algebra Apply


• Let a and b denote bit vectors [a_w-1, a_w-2,
…, a_0] and [b_w-1, b_w-2, …, b_0].
• a&b would be a bit vector of length w, where
the ith element would be a_i&b_i
Representing & Manipulating Sets
• Representation
– Can encode any subset {0, …, w–1} with a bit vector [a_w-1,
a_w-2, …, a_0]
– represents subsets of Width w
– aj = 1 if j ∈ A

• 01101001 { 0, 3, 5, 6 }
• 76543210

• 01010101 { 0, 2, 4, 6 }
• 76543210
• Operations
– & Intersection 01000001 { 0, 6 }
– | Union 01111101 { 0, 2, 3, 4, 5, 6 }
– ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
– ~ Complement 10101010 { 1, 3, 5, 7 }
Bit-Level Operations in C
• Operations & (AND), | (OR), ~ (NOT), ^
(Exclusive-OR) Available in C
– Apply to any “integral” data type
• long, int, short, char, unsigned
– View arguments as bit vectors
• Examples (Char data type)
– ~0x41 is 0xBE
• ~010000012 is 101111102
– ~0x00 is 0xFF
• ~000000002 is 111111112
– 0x69 & 0x55 is 0x41
• 011010012 & 010101012 is 010000012
– 0x69 | 0x55 is 0x7D
• 011010012 | 010101012 is 011111012
Contrast: Logic Operations in C
• Contrast to Logical Operators
– && (AND), || (OR), ! (NOT)
• View 0 as “False”
• Anything nonzero as “True”
• Always return 0 or 1
• Watch out
Early termination for && vs. & (and || vs.
• Examples (char|)… data type)vs. bit level operators
logical
– !0x41 is 0x00
– !0x00 is 0x01
one of the more common oopsies in
– !!0x41 is 0x01 C programming
– 0x69 && 0x55 is 0x01
– 0x69 || 0x55 is 0x01
– p && *p - avoids null pointer access since logical operators
do not evaluate second argument if result can be
determined with first
– Similarly a && 5/a will never cause a division by 0
Shift Operations
• Left Shift: x << y
Argument x 01100010
– Shift bit-vector x left y positions
– Throw away extra bits on left << 3 00010000
• Fill with 0’s on right Log. >> 2 00011000
• Right Shift: x >> y Arith. >> 2 00011000
– Shift bit-vector x right y positions
• Throw away extra bits on right
Argument x 10100010
– Logical shift
• Fill with 0’s on left << 3 00010000

– Arithmetic shift Log. >> 2 00101000


• Replicate most significant bit on left Arith. >> 2 11101000
• Undefined Behavior
– Shift amount < 0 or ≥ word size
Integer Representations: Encoding
Integers
Unsigned Two’s Complement
w−1 w−2
 xi 2  xi 2
i w−1 i
B2U(X ) = B2T (X ) = − xw−1 2 +
i=0 i=0

short int x = 15213;


short int y = -15213; Sign
Bit
• C short 2 bytes long
Decimal Hex Binary
x 15213 3B 6D 00111011 01101101
y -15213 C4 93 11000100 10010011

• Sign Bit
– For 2’s complement, most significant bit indicates sign
• 0 for nonnegative
• 1 for negative
Two’s-complement Encoding Example
x = 15213: 00111011 01101101
y = -15213: 11000100 10010011

Weight 15213 -15213


1 1 1 1 1
2 0 0 1 2
4 1 4 0 0
8 1 8 0 0
16 0 0 1 16
32 1 32 0 0
64 1 64 0 0
128 0 0 1 128
256 1 256 0 0
512 1 512 0 0
1024 0 0 1 1024
2048 1 2048 0 0
4096 1 4096 0 0
8192 1 8192 0 0
16384 0 0 1 16384
-32768 0 0 1 -32768
Sum 15213 -15213
Numeric Ranges
• Unsigned Values
• Two’s Complement Values
– UMin = 0
– TMin = –2w–1
000…0
100…0
– UMax = 2w –1
– TMax = 2w–1 – 1
111…1
011…1
• Other Values
– Minus 1
111…1
Values for w = 16 bits
Decimal Hex Binary
UMax 65535 FF FF 11111111 11111111
TMax 32767 7F FF 01111111 11111111
TMin -32768 80 00 10000000 00000000
-1 -1 FF FF 11111111 11111111
0 0 00 00 00000000 00000000
Values for Different Word Sizes
W
8 16 32 64
UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615
TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807
TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808

• Observations  C Programming
▪ #include <limits.h>
– |TMin | = TMax + 1
▪ Declares constants, e.g.,
• Asymmetric range
▪ ULONG_MAX
– UMax = 2 * TMax + ▪ LONG_MAX
1 ▪ LONG_MIN
▪ Values platform specific
Unsigned & Signed Numeric Values
X B2U(X) B2T(X) • Equivalence
0000 0 0 – Same encodings for
0001 1 1 nonnegative values
0010 2 2 • Uniqueness
0011 3 3
– Every bit pattern represents
0100 4 4
unique integer value
0101 5 5
0110 6 6 – Each representable integer
0111 7 7
has unique bit encoding
1000 8 –8 •  Can Invert Mappings
1001 9 –7 – U2B(x) = B2U-1(x)
1010 10 –6 • Bit pattern for unsigned
1011 11 –5 integer
1100 12 –4 – T2B(x) = B2T-1(x)
1101 13 –3 • Bit pattern for two’s comp
1110 14 –2 integer
1111 15 –1
Mapping Between Signed & Unsigned

Two’s Complement Unsigned


T2U
x T2B B2U ux
X

Maintain Same Bit Pattern

Unsigned Two’s Complement


U2T
ux U2B X B2T x

Maintain Same Bit Pattern

• Mappings between unsigned and two’s complement numbers:


Keep bit representations and reinterpret
Mapping Signed  Unsigned
Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 T2U 5
0110 6 6
U2T
0111 7 7
1000 -8 8
1001 -7 9
1010 -6 10
1011 -5 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Mapping Signed  Unsigned
Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4
= 4
0101 5 5
0110 6 6
0111 7 7
1000 -8 8
1001 -7 9
1010 -6 10
+/- 16
1011 -5 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Relation between Signed & Unsigned

Two’s Complement Unsigned


T2U
x T2B B2U ux
X

Maintain Same Bit Pattern

w–1 0
ux + + + ••• + + +
x - + + ••• + + +

Add 2^w if x < 0,


Large negative weight otherwise remains same
becomes
Large positive weight
Conversion Visualized
• 2’s Comp. → Unsigned UMax
– Ordering Inversion UMax – 1

– Negative → Big Positive


TMax + 1 Unsigned
TMax TMax Range

2’s Complement 0 0
Range –1
–2

TMin
Signed vs. Unsigned in C
• Constants
– By default are considered to be signed integers
– Unsigned if have “U” as suffix
0U, 4294967259U
• Casting
– Explicit casting between signed & unsigned same as U2T
and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;

– Implicit casting also occurs via assignments and procedure


calls
tx = ux; /* cast to signed */
uy = ty; /* cast to unsigned */
Casting Surprises
• Expression Evaluation
–If there is a mix of unsigned and signed in single
expression,
signed values implicitly cast to unsigned
–Including comparison operations <, >, ==, <=, >=
–Examples for W = 32: TMIN = -2,147,483,648 ,
TMAX = 2,147,483,647
Casting Surprises
Constant1 Constant2 Relation Evaluation

0 0U == Unsigned

-1 0 < Signed

-1 0U > Unsigned

2147483647 -2147483647-1 > Signed

2147483647U -2147483647-1 < Unsigned

-1 -2 > Signed

(unsigned)-1 -2 > Unsigned

2147483647 2147483648U < Unsigned

2147483647 (int) 2147483648U > signed


Summary: Casting Signed
Unsigned: Basic Rules
• Bit pattern is maintained
• But reinterpreted
• Can have unexpected effects: adding or
subtracting 2w
• Expression containing signed and unsigned int
– int is cast to unsigned!!
Sign Extension
• Task:
– Given w-bit signed integer x
– Convert it to w+k-bit integer with same value
• Rule:
– Make k copies of sign bit:
– X  = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0
w
k copies of MSB
X •••

•••

X ••• •••
k w
Sign Extension Example
short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;

Decimal Hex Binary


x 15213 3B 6D 00111011 01101101
ix 15213 00 00 3B 6D 00000000 00000000 00111011 01101101
y -15213 C4 93 11000100 10010011
iy -15213 FF FF C4 93 11111111 11111111 11000100 10010011

• Converting from smaller to larger integer data


type
• C automatically performs sign extension
Truncation of number
• When truncating a w bit number to a k-bit
number, we drop the high order w-k bits
• Result: x’ = x mod 2^k
• While similar property holds for twos-
complement, it converts the most significant bit
into a sign bit
int x = 53191
short sx = (short) x /* -12345 */
int y = sx; /* -12345 */
Truncation of number
• Summary:
• B2Uk([xk-1, xk-1, …, x0]) = B2Uk([xw-1, xw-2, …,
x0]) mod 2^k
• B2Tk([xk-1, xk-1, …, x0]) = U2Tk(B2U([xw-1, xw-2,
…, x0]) mod 2^k)
Summary: Expanding, Truncating: Basic
Rules
• Expanding (e.g., short int to int)
– Unsigned: zeros added
– Signed: sign extension
– Both yield expected result

• Truncating (e.g., unsigned to unsigned short)


– Unsigned/signed: bits are truncated
– Result reinterpreted
– Unsigned: mod operation
– Signed: similar to mod
– For small numbers yields expected behavior
Unsigned Addition
u •••
Operands: w bits
+ v •••
True Sum: w+1 bits u+v •••

Discard Carry: w bits UAddw(u , v) •••

• Standard Addition Function


– Ignores carry output
• Implements Modular Arithmetic
s = UAddw(u , v) = u + v mod 2w
Visualizing (Mathematical) Integer
Addition
• Integer Addition Add4(u , v)

Integer Addition
–4-bit integers u,
v
–Compute true 32
28

sum Add4(u , v) 24
20

–Values increase 16
12 12
14

linearly with u 8
4
8
10

6
v
and v 0
0
2 2
4

–Forms planar
6
u 8
10
12
0
14

surface
Visualizing Unsigned Addition
• Wraps Around Overflow
– If true sum ≥ 2w
UAdd4(u , v)
– At most once
– Decrements by 2w
True Sum 16
14
2w+1
Overflow 12
10
8
2w 6 12
14

4 10
8
2
6
v
0 0 4
Modular Sum 0
2
4 2
6
u 8
10
12
0
14
Two’s Complement Addition
Operands: w bits u •••
+ v •••
True Sum: w+1 bits u+v •••
Discard Carry: w bits TAddw(u , v) •••

• TAdd and UAdd have Identical Bit-Level Behavior


– Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v
– Will give s == t
TAdd Overflow
True Sum
• Functionality
0 111…1 2w–1
– True sum PosOver TAdd Result
requires w+1 0 100…0 2w –1–1 011…1

bits
0 000…0 0 000…0
– Drop off MSB
– Treat remaining 1 011…1 –2w –1 100…0

bits as 2’s comp. 1 000…0 NegOver


–2w
integer
Two’s Complement Addition
• In summary, subtract 2^w if positive overflow
• Add 2^w if negative overflow
• No changes if 2^(w-1) <= sum < 2^(w-1)
• For w = 4 bits,
• -8 [1000] + -5 [1011] = -13 [10011] = 3 [0011]
• 5 [0101] + 5 [0101] = 10 [01010] = -6 [1010]
Visualizing 2’s Complement Addition
NegOver

• Values
– 4-bit two’s comp. TAdd4(u , v)

– Range from -8 to
+7
• Wraps Around 8
6

– If sum  2w–1 4
2

• Becomes 0
6

negative -2
-4 2
4

• At most once -6
-2
0

– If sum < –2w–1 v


-8 -4
-8
-6 -6
-4

• Becomes positive
-2
0 -8
2
u 4
6 PosOver
• At most once
Two’s Complement Negation
• Complement the bits, increment the result by 1
• 0101 [5] → 1010 [-6] → 1011 [-5]
• 1000 [-8] → 0111 [7] → 1000 [-8]
• …
Multiplication
• Goal: Computing Product of w-bit numbers x, y
– Either signed or unsigned
• But, exact results can be bigger than w bits
– Unsigned: up to 2w bits
• Result range: 0 ≤ x * y ≤ (2w – 1) 2 = 22w – 2w+1 + 1
– Two’s complement min (negative): Up to 2w-1 bits
• Result range: x * y ≥ (–2w–1)*(2w–1–1) = –22w–2 + 2w–1
– Two’s complement max (positive): Up to 2w bits, but only
for (TMinw)2
• Result range: x * y ≤ (–2w–1) 2 = 22w–2
• So, maintaining exact results…
– would need to keep expanding word size with each
product computed
– is done in software, if needed
• e.g., by “arbitrary precision” arithmetic packages
Unsigned Multiplication in C
u •••
Operands: w bits
* v •••

True Product: 2*w bits u · v ••• •••


UMultw(u , v) •••
Discard w bits: w bits

• Standard Multiplication Function


– Ignores high order w bits
• Implements Modular Arithmetic
UMultw(u , v) = u · v mod 2w
Signed Multiplication in C
u •••
Operands: w bits
* v •••

True Product: 2*w bits u · v ••• •••


TMultw(u , v) •••
Discard w bits: w bits

• Standard Multiplication Function


– Ignores high order w bits
– Some of which are different for signed
vs. unsigned multiplication
– Lower bits are the same
Example
• Unsigned: 5 [101] * 3 [011] = 15 [01111] → 7
[111] Truncated
• 101
• 011
• 101
• 101
• 000
----------
• 01111
Example
• Two’s C: -3 [101] * 3 [011] = -9 [110111] → -1 [111]
Truncated
• Need to sign extend and then multiply
• 111101
• 000011
• 111101
• 111101
• 000000
• 000000
• 000000
• 000000
• -----------------
• 000101110111
Power-of-2 Multiply with Shift
• Operation
– u << k gives u * 2k
– Both signed and unsigned k
u •••
Operands: w bits
* 2k 0 ••• 0 1 0 ••• 0 0

True Product: w+k bits u · 2k ••• 0 ••• 0 0

Discard k bits: w bits UMultw(u , 2k) ••• 0 ••• 0 0


• Examples TMultw(u , 2k)

– u << 3 == u * 8
– (u << 5) – (u << 3) == u * 24
– Most machines shift and add faster than multiply
• Compiler generates this code automatically
Unsigned Power-of-2 Divide with Shift
• Quotient of Unsigned by Power of 2
– u >> k gives  u / 2k 
– Uses logical shift
k
u ••• ••• Binary Point
Operands:
/ 2k 0 ••• 0 1 0 ••• 0 0
Division: u / 2k 0 ••• 0 0 ••• . •••

Result:  u / 2k  0 ••• 0 0 •••

Division Computed Hex Binary


x 15213 15213 3B 6D 00111011 01101101
x >> 1 7606.5 7606 1D B6 00011101 10110110
x >> 4 950.8125 950 03 B6 00000011 10110110
x >> 8 59.4257813 59 00 3B 00000000 00111011
Arithmetic: Basic Rules
• Addition:
– Unsigned/signed: Normal addition followed by truncate,
same operation on bit level
– Unsigned: addition mod 2w
• Mathematical addition + possible subtraction of 2w
– Signed: modified addition mod 2w (result in proper range)
• Mathematical addition + possible addition or subtraction of 2w

• Multiplication:
– Unsigned/signed: Normal multiplication followed by truncate,
same operation on bit level
– Unsigned: multiplication mod 2w
– Signed: modified multiplication mod 2w (result in proper
range)
Using Unsigned
• Don’t use without understanding implications
– Easy to make mistakes
unsigned i;
for (i = cnt-2; i >= 0; i--)
a[i] += a[i+1];

– Can be very subtle


#define DELTA sizeof(int)
int i;
for (i = CNT; i-DELTA >= 0; i-= DELTA)
. . .

You might also like