Technische Universität München
Chip Multicore Processors
Tutorial 10
S. Wallentowitz
Institute for Integrated Systems, Theresienstr. 90, Building N1, www.lis.ei.tum.de
Task 10.1: Article Discussion
"Why On-Chip Cache Coherence Is Here to Stay", Milo M. K. Martin, Mark D. Hill, Daniel J. Sorin, Communications of the ACM, July 2012
Task 10.2: Scalability of Interconnects
Consider a simple embedded system with P processor cores and a 32-bit wide embedded SRAM. Each memory access has a latency of one cycle.
a)
A simple data bus (no pipelining) is used to connect the processor cores and the memory. What is the average bandwidth for each processor core assuming all cores generate consecutive memory accesses?
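As a starting point for the calculation, the shared-bus case can be sketched in a few lines of Python. The function name and the `cycles_per_access` parameter are assumptions made for illustration: the task states only the one-cycle memory latency, so how many bus cycles a complete non-pipelined access occupies (for example, one request cycle plus one data cycle) has to be chosen by the reader.

```python
def bus_bandwidth_per_core(num_cores, cycles_per_access, word_bytes=4):
    """Average bytes per cycle available to each core on a shared bus.

    Assumptions (not given in the task): the bus serves exactly one access
    at a time, arbitration is fair and free, and every access occupies the
    bus for `cycles_per_access` cycles. `word_bytes=4` is the 32-bit memory.
    """
    if num_cores < 1 or cycles_per_access < 1:
        raise ValueError("num_cores and cycles_per_access must be >= 1")
    total_bandwidth = word_bytes / cycles_per_access  # bytes per cycle on the bus
    return total_bandwidth / num_cores                # fair share per core


# Example: 4 cores, assuming 2 bus cycles per access
print(bus_bandwidth_per_core(4, cycles_per_access=2))  # 0.5
```

The key point of the model is that the total bus bandwidth is fixed, so the share per core shrinks with 1/P.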
b)
The bus and memory allow for 4-beat bursts. How does the achievable bandwidth change?
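The effect of bursts can be explored with a similarly small model. It assumes that each transfer pays a fixed number of setup cycles (the hypothetical parameter `setup_cycles`, for example one address cycle) and then delivers one 32-bit word per beat. Neither the name nor the value of that overhead comes from the task.

```python
def burst_bandwidth(beats, setup_cycles=1, word_bytes=4):
    """Total bus bandwidth in bytes per cycle for transfers of `beats` words.

    Assumption: a transfer costs `setup_cycles` of overhead plus one cycle
    per beat, and transfers follow each other without idle cycles.
    """
    if beats < 1 or setup_cycles < 0:
        raise ValueError("beats must be >= 1 and setup_cycles >= 0")
    return beats * word_bytes / (setup_cycles + beats)


single = burst_bandwidth(beats=1)  # 4 bytes in 2 cycles
burst4 = burst_bandwidth(beats=4)  # 16 bytes in 5 cycles
print(single, burst4)
```

Under these assumptions the overhead is amortised over four words instead of one. The bus is still shared, so the per-core share remains the total divided by P.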
c)
Draw a simple schematic sketch that shows the differences between a simple bus and a simple crossbar for the given scenario.
d)
How does the bandwidth change for a crossbar? What is the limiting factor? In what other scenario can the performance be improved?
e)
How can a bi-directional ring improve the setup with respect to bandwidth and latency? Describe scenarios in which a ring is advantageous and scenarios in which it is disadvantageous.
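For the latency part of the discussion, hop counts on a ring are easy to compute. The sketch below compares a uni-directional and a bi-directional ring. The node numbering (0 to N-1 around the ring) and the function names are assumptions for illustration, and hop count is used only as a rough proxy for latency.

```python
def ring_hops(src, dst, num_nodes, bidirectional=True):
    """Number of hops from src to dst on a ring with nodes 0..num_nodes-1."""
    clockwise = (dst - src) % num_nodes
    if not bidirectional:
        return clockwise
    # A bi-directional ring can take the shorter of the two directions.
    return min(clockwise, num_nodes - clockwise)


def average_hops(num_nodes, bidirectional=True):
    """Average hop count from one node to every other node."""
    total = sum(ring_hops(0, d, num_nodes, bidirectional)
                for d in range(1, num_nodes))
    return total / (num_nodes - 1)


print(average_hops(8, bidirectional=False))  # 4.0
print(average_hops(8, bidirectional=True))   # 16/7, roughly 2.29
```

The worst-case distance drops from N-1 hops to floor(N/2) hops. It still grows linearly with the number of nodes, which is worth keeping in mind when weighing the ring against a crossbar.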
Task 10.3: Turn Model
a) What is the difference between XY-routing and west-first routing?
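The two routing functions can be written down compactly for a 2D mesh. The sketch below assumes coordinates `(x, y)` with x growing towards the east and y growing towards the north, and it implements the minimal (shortest-path) variant of west-first routing. Both choices are assumptions for illustration and are not taken from the lecture.

```python
def xy_route(cur, dst):
    """XY-routing: finish all hops in X before taking any hop in Y."""
    (cx, cy), (dx, dy) = cur, dst
    if dx > cx:
        return ["E"]
    if dx < cx:
        return ["W"]
    if dy > cy:
        return ["N"]
    if dy < cy:
        return ["S"]
    return []  # already at the destination


def west_first_route(cur, dst):
    """Minimal west-first routing: go west first, then route adaptively."""
    (cx, cy), (dx, dy) = cur, dst
    if dx < cx:
        return ["W"]  # all westward hops must be taken first
    options = []
    if dx > cx:
        options.append("E")
    if dy > cy:
        options.append("N")
    if dy < cy:
        options.append("S")
    return options  # several entries mean the router may choose


print(xy_route((0, 0), (2, 2)))          # ['E']
print(west_first_route((0, 0), (2, 2)))  # ['E', 'N']
print(west_first_route((2, 0), (0, 2)))  # ['W']
```

XY-routing always returns a single direction, so it is deterministic. West-first routing returns several directions whenever the destination is not to the west, which makes it partially adaptive.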
b)
Change the channel dependency diagram of XY-routing so that it reflects west-first routing. Mark all forbidden turns and add new routes where necessary.
c)
Check whether the depicted routing function is a valid turn model. If not, show the potential cycle.
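A check of this kind can also be automated. The sketch below builds the channel dependency graph of a small 2D mesh for a given set of forbidden turns and searches it for a cycle with a depth-first search. The turn encoding (pairs such as `("N", "W")` for a packet travelling north that turns west), the mesh size, and the function name are assumptions made for illustration. The depicted routing function itself is not reproduced here, so its forbidden turns have to be entered by hand.

```python
MOVES = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
OPPOSITE = {"E": "W", "W": "E", "N": "S", "S": "N"}


def has_dependency_cycle(forbidden_turns, size=4):
    """Return True if the channel dependency graph of a size x size mesh
    contains a cycle, given a set of forbidden (from_dir, to_dir) turns.
    Going straight is always allowed; 180-degree turns are never allowed."""
    def inside(x, y):
        return 0 <= x < size and 0 <= y < size

    # A channel is identified by its source node and its direction.
    channels = [((x, y), d)
                for x in range(size) for y in range(size)
                for d, (mx, my) in MOVES.items() if inside(x + mx, y + my)]

    def successors(channel):
        (x, y), d = channel
        nx, ny = x + MOVES[d][0], y + MOVES[d][1]
        for d2, (mx, my) in MOVES.items():
            if d2 == OPPOSITE[d] or (d, d2) in forbidden_turns:
                continue
            if inside(nx + mx, ny + my):
                yield ((nx, ny), d2)

    WHITE, GREY, BLACK = 0, 1, 2
    state = {c: WHITE for c in channels}

    def visit(channel):
        state[channel] = GREY
        for nxt in successors(channel):
            if state[nxt] == GREY:  # back edge: a cycle exists
                return True
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[channel] = BLACK
        return False

    return any(state[c] == WHITE and visit(c) for c in channels)


xy = {("N", "E"), ("N", "W"), ("S", "E"), ("S", "W")}
west_first = {("N", "W"), ("S", "W")}
print(has_dependency_cycle(xy))          # False
print(has_dependency_cycle(west_first))  # False
print(has_dependency_cycle(set()))       # True
```

A mesh that is too small may hide cycles that need more room to close, so it is worth repeating the check with a larger `size` before trusting a negative result.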