Systems I
Machine-Level Programming I:
Introduction
Topics
Assembly Programmers
Execution Model
Accessing Information
Registers
IA32 Processors
Totally Dominate General Purpose CPU Market
Evolutionary Design
Starting in 1978 with 8086
Added more features as time goes on
Still support old features, although obsolete
Complex Instruction Set Computer (CISC)
Many different instructions with many different formats
But, only small subset encountered with Linux programs
Hard to match performance of Reduced Instruction Set
Computers (RISC)
But, Intel has done just that!
2
X86 Evolution: Programmers View
Name
8086
386
Transistors
1978
29K
16-bit processor. Basis for IBM PC & DOS
Limited to 1MB address space. DOS only
gives you 640K
80286
Date
1982
134K
Added elaborate, but not very useful,
addressing scheme
Basis for IBM PC-AT and Windows
1985
275K
Extended to 32 bits. Added flat
addressing
Capable of running Unix
Linux/gcc uses no instructions introduced
in later models
3
X86 Evolution: Programmers View
Name
486
Date
Transistors
1989
1.9M
Added on-chip floating-point unit
Pentium
1993
3.1M
Pentium/MMX
1997
4.5M
Added special collection of instructions
for operating on 64-bit vectors of 1, 2, or
4 byte integer data
PentiumPro
1995
6.5M
Added conditional move instructions
Hardware can execute instructions out of
order
4
X86 Evolution: Programmers View
Name
Pentium III
1999
8.2M
2001
42M
Added 8-byte formats and 144 new
instructions for streaming SIMD mode
Superpipelined with very fast clocks
Nehalem
Transistors
Added streaming SIMD instructions
for operating on 128-bit vectors of 1, 2,
or 4 byte integer or floating point data
Pentium 4
Date
2009
700M+
4 Cores on the same chip
8MB+ of on-chip memory
X86 Evolution: Clones
Advanced Micro Devices (AMD)
Historically
AMD has followed just behind Intel
A little bit slower, a lot cheaper
Recently
Drove 64-bit extensions to IA32 architecture
Acquired ATI (graphics chip company)
Increasing core counts too
6-core Opteron (Istanbul) 2009
Variety of x86 chips in different markets
Embedded/low power (Atom, Neo)
Desktop/laptop
Server
Supercomputer
6
Abstract and Concrete Machine
Models
Machine Models
C
mem
proc
Assembly
mem
Stack
regs
alu
Cond.
processor
Codes
Data
1) char
2) int, float
3) double
4) struct, array
5) pointer
Control
1) loops
2) conditionals
3) switch
4) Proc. call
5) Proc. return
1) byte
3) branch/jump
2) 2-byte word
4) call
3) 4-byte long word 5) ret
4) contiguous byte allocation
5) address of initial byte
7
Assembly Programmers View
CPU
Memory
Addresses
E
I
P
Registers
Data
Condition
Codes
Instructions
Object Code
Program Data
OS Data
Stack
Programmer-Visible State
EIP
Program Counter
Address of next instruction
Register File
Heavily used program data
Condition Codes
Store status information about
most recent arithmetic operation
Used for conditional branching
Memory
Byte addressable array
Code, user data, (some) OS
data
Includes stack used to
support procedures
By the architecture of a system, I mean the complete and
detailed specification of the user interface. As Blaauw
has said, Where architecture tells what happens,
implementation tells how it is made to happen.
The Mythical Man-Month, Brooks, pg 45
Slide adapted from M.J. Irwin, 2005
Instruction Set Architecture
Principles
Contract between programmer and the hardware
Defines visible state of the system
Defines how state changes in response to instructions
Programmer: ISA is model of how a program will execute
Hardware Designer: ISA is formal definition of the correct way to
execute a program
With a stable ISA, SW doesnt care what the HW looks like under
the covers
Hardware implementations can change (drastically)
As long as the HW implements the same ISA, all prior SW will still run
Example: x86 ISA has spanned many chips
Instructions have been added but the SW of prior chips still runs on
later chips
ISA specification
The binary encodings of the instruction set
10
Instruction Set Architecture
Contract between programmer and the hardware
Defines visible state of the system
Defines how state changes in response to instructions
Programmer: ISA is model of how a program will
execute
Hardware Designer: ISA is formal definition of the
correct way to execute a program
ISA specification
The binary encodings of the instruction set
11
ISA Basics
Instruction formats
Instruction types
Addressing modes
instruction
Op Mode Ra Rb
Mem
Mem
Regs
Before State
Data types
Operations
Interrupts/Events
Regs
After State
Machine state
Memory organization
Register organization
12
Architecture vs. Implementation
Architecture: defines what a computer system
does in response to a program and a set of
data
Programmer visible elements of computer system
Implementation: defines how a computer does
it
Sequence of steps to complete operations
Time to execute each operation
Hidden bookkeeping functions
13
Examples
Architecture or Implementation?
Number of GP registers
Width of memory bus
Binary representation of the instruction
sub r4,r2,#27
Number of cycles to execute FP instruction
How condition code bits are set on a move instruction
Size of the instruction cache
Type of FP format
14
Turning C into Object Code
Code in files p1.c p2.c
Compile with command: gcc -O p1.c p2.c -o p
Use optimizations (-O)
Put resulting binary in file p
text
C program (p1.c p2.c)
Compiler (gcc -S)
text
Asm program (p1.s p2.s)
Assembler (gcc or as)
binary
Object program (p1.o p2.o)
Static libraries
(.a)
Linker (gcc or ld)
binary
Executable program (p)
15
Compiling Into Assembly
C Code
int sum(int x, int y)
{
int t = x+y;
return t;
}
Generated Assembly
_sum:
pushl %ebp
movl %esp,%ebp
movl 12(%ebp),%eax
addl 8(%ebp),%eax
movl %ebp,%esp
popl %ebp
ret
Obtain with command
gcc -O -S code.c
Produces file code.s
16
Assembly Characteristics
Minimal Data Types
Integer data of 1, 2, or 4 bytes
Data values
Addresses (untyped pointers)
Floating point data of 4, 8, or 10 bytes
No aggregate types such as arrays or structures
Just contiguously allocated bytes in memory
Primitive Operations
Perform arithmetic function on register or memory data
Transfer data between memory and register
Load data from memory into register
Store register data into memory
Transfer control
Unconditional jumps to/from procedures
Conditional branches
17
Object Code
Code for sum
Assembler
Translates .s into .o
Some libraries are dynamically linked
0x401040 <sum>:
Binary encoding of each instruction
0x55
Total of 13
0x89
Nearly-complete image of executable
bytes
0xe5
code
Each
0x8b
instruction 1,
Missing linkages between code in
0x45
2, or 3 bytes
different files
0x0c
Starts at
0x03
address
Linker
0x45
0x401040
0x08
Resolves references between files
0x89
Combines with static run-time
0xec
libraries
0x5d
E.g., code for malloc, printf
0xc3
Linking occurs when program begins
execution
18
Machine Instruction Example
C Code
int t = x+y;
Add two signed integers
Assembly
addl 8(%ebp),%eax
Similar to
expression
x += y
Add 2 4-byte integers
Long words in GCC parlance
Same instruction whether
signed or unsigned
Operands:
x:
y:
t:
0x401046:
03 45 08
Register
%eax
Memory
M[%ebp+8]
Register
%eax
Return function value in %eax
Object Code
3-byte instruction
Stored at address 0x401046
19
Disassembling Object Code
Disassembled
00401040 <_sum>:
0:
55
1:
89 e5
3:
8b 45 0c
6:
03 45 08
9:
89 ec
b:
5d
c:
c3
d:
8d 76 00
push
mov
mov
add
mov
pop
ret
lea
%ebp
%esp,%ebp
0xc(%ebp),%eax
0x8(%ebp),%eax
%ebp,%esp
%ebp
0x0(%esi),%esi
Disassembler
objdump -d p
Useful tool for examining object code
Analyzes bit pattern of series of instructions
Produces approximate rendition of assembly code
Can be run on either a.out (complete executable) or .o file
20
Alternate Disassembly
Disassembled
Object
0x401040:
0x55
0x89
0xe5
0x8b
0x45
0x0c
0x03
0x45
0x08
0x89
0xec
0x5d
0xc3
0x401040
0x401041
0x401043
0x401046
0x401049
0x40104b
0x40104c
0x40104d
<sum>:
<sum+1>:
<sum+3>:
<sum+6>:
<sum+9>:
<sum+11>:
<sum+12>:
<sum+13>:
push
mov
mov
add
mov
pop
ret
lea
%ebp
%esp,%ebp
0xc(%ebp),%eax
0x8(%ebp),%eax
%ebp,%esp
%ebp
0x0(%esi),%esi
Within gdb Debugger
gdb p
disassemble sum
Disassemble procedure
x/13b sum
Examine the 13 bytes starting at sum
21
What Can be Disassembled?
% objdump -d WINWORD.EXE
WINWORD.EXE:
file format pei-i386
No symbols in "WINWORD.EXE".
Disassembly of section .text:
30001000 <.text>:
30001000: 55
30001001: 8b ec
30001003: 6a ff
30001005: 68 90 10 00 30
3000100a: 68 91 dc 4c 30
push
mov
push
push
push
%ebp
%esp,%ebp
$0xffffffff
$0x30001090
$0x304cdc91
Anything that can be interpreted as executable code
Disassembler examines bytes and reconstructs assembly
source
22
Whose Assembler?
Intel/Microsoft Format
GAS/Gnu Format
lea
sub
cmp
mov
leal
subl
cmpl
movl
eax,[ecx+ecx*2]
esp,8
dword ptr [ebp-8],0
eax,dword ptr [eax*4+100h]
(%ecx,%ecx,2),%eax
$8,%esp
$0,-8(%ebp)
$0x100(,%eax,4),%eax
Intel/Microsoft Differs from GAS
Operands listed in opposite order
mov Dest, Src
Constants not preceded by $, Denote hex with h at end
100h
$0x100
Operand size indicated by operands rather than operator suffix
sub
movl Src, Dest
subl
Addressing format shows effective address computation
[eax*4+100h]
$0x100(,%eax,4)
23
Moving Data
%eax
%edx
Moving Data
movl Source,Dest:
Move 4-byte (long) word
Lots of these in typical code
Operand Types
Immediate: Constant integer data
Like C constant, but prefixed with $
%ecx
%ebx
%esi
%edi
%esp
%ebp
E.g., $0x400, $-533
Encoded with 1, 2, or 4 bytes
Register: One of 8 integer registers
But %esp and %ebp reserved for special use
Others have special uses for particular instructions
Memory: 4 consecutive bytes of memory
Various address modes
24
movl Operand Combinations
Source
movl
Destination
C Analog
movl $0x4,%eax
temp = 0x4;
movl $-147,(%eax)
*p = -147;
Imm
Reg
Mem
Reg
Reg
movl %eax,%edx
temp2 = temp1;
Mem
movl %eax,(%edx)
*p = temp;
Reg
movl (%eax),%edx
temp = *p;
Mem
Cannot do memory-memory transfers with single
instruction
25
Simple Addressing Modes
Normal
(R)
Mem[Reg[R]]
Register R specifies memory address
movl (%ecx),%eax
Displacement
D(R)
Mem[Reg[R]+D]
Register R specifies start of memory region
Constant displacement D specifies offset
movl 8(%ebp),%edx
26
Summary
Today
ISA/processor evolution (for x86)
Programmer machine models
Introduction to ISA and usage
Next time
Memory access
Arithmetic operations
C pointers and Addresses
27