Modern Processors
Vipin Vasu

Outline
1 Stored Program Computer Architecture
2 General-purpose cache-based microprocessor architecture
3 Performance metrics and benchmarks
4 Low-Level Benchmarks
5 Transistors galore: Moore’s Law
6 Pipelining
7 Superscalarity
8 SIMD
9 Conclusion

Stored Program Computer Architecture
• Instructions and data must be continuously fed to the control and arithmetic units, so the speed of the memory interface limits compute performance.
• The architecture is inherently sequential: it processes a single instruction with (possibly) a single operand or a group of operands from memory (SISD: single instruction, single data).
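
The two points above can be illustrated with a toy stored-program machine. This is only a sketch: the instruction names (LOAD, ADD, STORE, HALT), the single accumulator, and the memory layout are hypothetical and do not come from any real instruction set. Instructions and data share one memory, and the loop handles exactly one instruction at a time, so every step begins with a memory access.

```python
# Toy stored-program machine: instructions and data live in the same memory.
# The opcodes below are hypothetical, chosen only for illustration.

def run(memory):
    """Execute the program stored in `memory`, one instruction at a time."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # a single accumulator register
    while True:
        opcode, operand = memory[pc]   # fetch: goes through the memory interface
        pc += 1
        if opcode == "LOAD":           # decode and execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")

program = [
    ("LOAD", 4),    # address 0: acc = memory[4]
    ("ADD", 5),     # address 1: acc += memory[5]
    ("STORE", 6),   # address 2: memory[6] = acc
    ("HALT", 0),    # address 3: stop
    2,              # address 4: data
    3,              # address 5: data
    0,              # address 6: result is written here
]

result = run(list(program))
print(result[6])  # prints 5
```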

General-purpose cache-based microprocessor architecture
• Microprocessors implement the stored-program concept.
• Modern processors have many components, but only a small part does the actual work: the arithmetic units (AU) for floating-point (FP) and integer (INT) operations.
• The rest includes the CPU registers; nowadays processors require all operands to reside in registers.
• Load (LD) and store (ST) units handle the instructions that transfer data to and from registers.
• Instruction queues hold instructions that are waiting to be executed.
• Finally, caches hold data and instructions that are likely to be used again soon.

Performance metrics and benchmarks
• All CPU components can operate at some maximum speed, called peak performance.
• This “speed” needs to be quantified, for both double-precision (DP) and single-precision (SP) operations.
• The rate at which the FP units generate results for multiply and add operations is measured in floating-point operations per second (Flops/sec).
• With 2-4 DP results per cycle and clock frequencies of 2-3 GHz, the peak is 4-12 GFlops/sec.
• How fast data can be delivered depends on the transfer speed of main memory and the caches.
• The performance, or bandwidth, of those data paths is quantified in GBytes/sec.
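
The 4-12 GFlops/sec range follows directly from multiplying results per cycle by clock frequency. A minimal sketch of that arithmetic (the function name `peak_gflops` is made up for this example):

```python
def peak_gflops(results_per_cycle, clock_ghz):
    """Peak arithmetic performance in GFlops/sec.

    results_per_cycle: DP floating-point results delivered per clock cycle
    clock_ghz: clock frequency in GHz (10^9 cycles per second)
    """
    return results_per_cycle * clock_ghz

low = peak_gflops(2, 2.0)    # 2 DP results per cycle at 2 GHz
high = peak_gflops(4, 3.0)   # 4 DP results per cycle at 3 GHz
print(low, high)  # prints 4.0 12.0
```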

Low-Level Benchmarks
• A low-level benchmark is a program that tests one specific feature of the architecture, such as peak performance or memory bandwidth.
• On standard microprocessors, performance grows with the loop length N until some maximum is reached, followed by several sudden breakdowns. Performance then stays constant for very large loops.
• The only safe way to decide whether a CPU or architecture is well suited for an application is to prepare application benchmarks.
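
As a rough illustration of how such a measurement is structured, the sketch below times a simple loop for several lengths N. The loop kernel a[i] = b[i] + c[i] * d[i] (often called a vector triad) is an assumption, since the slide does not name the benchmark, and `triad_mflops` is a name invented here. In pure Python the interpreter overhead dominates, so the cache-related breakdowns described above will not show up; a compiled language would be needed for realistic numbers.

```python
import time

def triad_mflops(n, repeats=20):
    """Time a[i] = b[i] + c[i] * d[i] for loop length n.

    Returns (rate in MFlops/sec, result list). The kernel is an assumed
    example; the repeat count keeps short loops measurable.
    """
    a = [0.0] * n
    b = [1.0] * n
    c = [2.0] * n
    d = [3.0] * n
    start = time.perf_counter()
    for _ in range(repeats):
        for i in range(n):
            a[i] = b[i] + c[i] * d[i]
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    flops = 2 * n * repeats   # one multiply and one add per iteration
    return flops / elapsed / 1e6, a

for n in (100, 1_000, 10_000):
    rate, _ = triad_mflops(n)
    print(f"N = {n:6d}: {rate:8.2f} MFlops/sec")
```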

Transistors galore: Moore’s Law
• Pipelined functional units
• Superscalar architecture
• Data parallelism through SIMD instructions
• Out-of-order execution
• Larger caches
• Simplified instruction set (CISC to RISC)

Pipelining
The simplest setup is a “fetch–decode–execute” pipeline, in which each stage can operate independently of the others.
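
The benefit of letting the stages work independently can be sketched with an idealized cycle count. The model assumes that every stage takes exactly one cycle, that the instructions are independent, and that the pipeline never stalls; the function names are invented for this example.

```python
def sequential_cycles(n_instructions, n_stages):
    """Cycles needed if each instruction passes through all stages
    before the next one starts (no overlap)."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages):
    """Cycles needed with an ideal pipeline: the first result appears
    after n_stages cycles, then one more result every cycle."""
    if n_instructions == 0:
        return 0
    return n_instructions + n_stages - 1

n, m = 1000, 3   # 1000 instructions, fetch-decode-execute
speedup = sequential_cycles(n, m) / pipelined_cycles(n, m)
print(sequential_cycles(n, m))  # prints 3000
print(pipelined_cycles(n, m))   # prints 1002
print(f"{speedup:.2f}")         # prints 2.99
```

For long instruction streams the speedup approaches the number of stages; the extra n_stages - 1 cycles are the time needed to fill the pipeline.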

Superscalarity
A processor is called superscalar if it is designed to be capable of executing more than one instruction or, more generally, producing more than one “result” per cycle.
• Multiple instructions can be fetched and decoded concurrently.
• Multiple floating-point pipelines can run in parallel.
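
A rough way to see the effect of producing more than one result per cycle is an idealized cycle count. This is a sketch under strong assumptions: all instructions are independent, each pipeline stage takes one cycle, and the processor can start `issue_width` instructions in every cycle. The function name and parameters are hypothetical.

```python
import math

def superscalar_cycles(n_instructions, n_stages, issue_width):
    """Ideal cycle count when `issue_width` independent instructions
    enter the pipeline together in each cycle."""
    if n_instructions == 0:
        return 0
    groups = math.ceil(n_instructions / issue_width)
    # The first group finishes after n_stages cycles,
    # every further group one cycle later.
    return groups + n_stages - 1

for width in (1, 2, 4):
    print(width, superscalar_cycles(1000, 3, width))
# prints:
# 1 1002
# 2 502
# 4 252
```

Real code rarely reaches these numbers, because dependencies between instructions limit how many can be issued together.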

SIMD
• The SIMD concept became widely known with the first vector supercomputers in the 1970s.
• SIMD instructions allow the concurrent execution of arithmetic operations on a “wide” register that can hold, e.g., two DP or four SP floating-point values.
• A single instruction can then initiate four SP additions at once.
• The individual operations may run truly in parallel or be fed through a single pipeline.
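
The idea of one instruction acting on a whole “wide” register can be emulated in plain Python. This is an illustration only: real SIMD is carried out by hardware instructions, and the lane count of four (matching four SP values per register) and the helper name `simd_add` are choices made for this sketch.

```python
LANES = 4  # four SP values per "wide" register, as in the bullets above

def simd_add(x, y, lanes=LANES):
    """Add two equal-length lists chunk by chunk.

    Each chunk stands for one wide register, and each loop iteration
    stands for one SIMD add instruction. Returns (sums, instruction count).
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    result = []
    instructions = 0
    for start in range(0, len(x), lanes):
        x_reg = x[start:start + lanes]   # load one wide register
        y_reg = y[start:start + lanes]
        result.extend(a + b for a, b in zip(x_reg, y_reg))
        instructions += 1                # one instruction, up to `lanes` additions
    return result, instructions

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [10.0] * 8
total, count = simd_add(x, y)
print(total)  # prints [11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0]
print(count)  # prints 2
```

Eight additions take two emulated instructions instead of eight scalar ones. Whether the four additions inside one instruction run at the same time or pass through one pipeline is up to the hardware.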

The End