Hardware acceleration is a widely accepted route to performant, energy-efficient computation because it removes hardware needed only for general-purpose computation while delivering exceptional performance via specialized control paths and execution units. The spectrum of accelerators available today ranges from coarse-grain off-load engines such as GPUs to fine-grain instruction set extensions such as SSE. This research explores the benefits and challenges of managing memory at the data-structure level and exposing those operations directly to the ISA. We call these instructions Abstract Datatype Instructions (ADIs). This paper quantifies the performance and energy impact of ADIs on the instruction and data cache hierarchies. For instruction fetch, our measurements indicate that ADIs can yield 21-48% and 16-27% reductions in instruction fetch time and energy, respectively. For data delivery, we observe a 22-40% reduction in total data read/write time and a 9-30% reduction in total data read/write energy.
2005
Instruction caches typically consume 27% of the total power in modern high-end embedded systems. We propose a compiler-managed instruction store architecture (K-store) that places computation-intensive loops in a scratchpad-like SRAM memory and allocates the remaining instructions to a regular instruction cache. At runtime, execution switches dynamically between the instructions in the traditional instruction cache and those in the K-store by means of inserted jump instructions. The necessary jump instructions add on average 0.038% to the total dynamic instruction count. We compare the performance and energy consumption of the K-store with that of a conventional instruction cache of equal size. When used in lieu of an 8 KB, 4-way set-associative instruction cache, the K-store provides a 32% reduction in energy and a 7% reduction in execution time. Unlike loop caches, the K-store maps the frequent code into a reserved address space and hence can switch between the kernel memory and the instruction cache without any noticeable performance penalty.
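As a rough illustration of the K-store allocation step, the sketch below greedily picks profile-hot loops for a fixed-size kernel store; the loop names, sizes, profile counts and the 4 KB capacity are invented for the example, and the selection heuristic is only an assumption, not the paper's compiler algorithm.

```cpp
// Illustrative sketch (not the paper's algorithm): profile-guided greedy
// selection of hot loops into a fixed-size kernel store (K-store).
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct Loop {
    std::string name;
    uint32_t    size_bytes;   // static code size of the loop body
    uint64_t    exec_count;   // dynamic instruction count from profiling
};

int main() {
    const uint32_t kstore_capacity = 4 * 1024;  // assumed 4 KB SRAM kernel store

    std::vector<Loop> loops = {
        {"fir_filter", 512, 90'000'000},
        {"huffman_decode", 1536, 40'000'000},
        {"crc32", 256, 25'000'000},
        {"init_tables", 2048, 10'000},
    };

    // Rank loops by dynamic instructions per byte of K-store they would occupy.
    std::sort(loops.begin(), loops.end(), [](const Loop& a, const Loop& b) {
        return a.exec_count * b.size_bytes > b.exec_count * a.size_bytes;
    });

    uint32_t used = 0;
    uint64_t inserted_jumps = 0;  // static jumps added: one to enter, one to return
    for (const Loop& l : loops) {
        if (used + l.size_bytes > kstore_capacity) continue;
        used += l.size_bytes;
        inserted_jumps += 2;
        std::cout << "place " << l.name << " in K-store (" << l.size_bytes << " B)\n";
    }
    std::cout << "K-store bytes used: " << used << " / " << kstore_capacity
              << ", static jumps inserted: " << inserted_jumps << "\n";
    return 0;
}
```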
2003 Design, Automation and Test in Europe Conference and Exhibition, 2003
Modern embedded processors use data caches with ever higher degrees of associativity in order to increase performance. A set-associative data cache consumes a significant fraction of the total power budget in such embedded processors. This paper describes a technique for reducing D-cache power consumption and shows its impact on the power and performance of an embedded processor. The technique uses cache-line address locality to determine (rather than predict) the cache way prior to the cache access. It thus allows only the desired way to be accessed, for both tags and data. The proposed mechanism is shown to reduce average L1 data cache power consumption when running the MiBench embedded benchmark suite for 8-, 16- and 32-way set-associative caches by an average of 66%, 72% and 76%, respectively. The absolute power savings from this technique increase significantly with associativity. The design has no impact on performance and, since it involves no mis-prediction penalties, it does not introduce any new non-deterministic behavior in program execution.
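A minimal software model of the way-determination idea follows; an idealized, unbounded table keyed by cache-line address stands in for the paper's bounded hardware structure, and all sizes and the workload are assumptions.

```cpp
// Illustrative sketch (assumptions, not the paper's design): a set-associative
// cache model in which a table keyed by cache-line address remembers the way
// a line was placed in, so a repeated access probes only that one way.
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

constexpr int kWays = 8;
constexpr int kSets = 64;
constexpr int kLineBytes = 32;

struct Set { uint64_t tag[kWays] = {0}; bool valid[kWays] = {false}; int next = 0; };

int main() {
    std::vector<Set> cache(kSets);
    std::unordered_map<uint64_t, int> way_memo;  // line address -> way (idealized, unbounded)
    uint64_t single_way_probes = 0, full_probes = 0;

    auto access = [&](uint64_t addr) {
        uint64_t line = addr / kLineBytes;
        int set_idx  = line % kSets;
        uint64_t tag = line / kSets;
        Set& s = cache[set_idx];

        auto memo = way_memo.find(line);
        if (memo != way_memo.end() && s.valid[memo->second] && s.tag[memo->second] == tag) {
            ++single_way_probes;          // way determined in advance: one tag+data read
            return;
        }
        ++full_probes;                    // fall back to probing every way
        for (int w = 0; w < kWays; ++w)
            if (s.valid[w] && s.tag[w] == tag) { way_memo[line] = w; return; }
        int victim = s.next; s.next = (s.next + 1) % kWays;   // simple round-robin fill
        s.tag[victim] = tag; s.valid[victim] = true; way_memo[line] = victim;
    };

    for (int rep = 0; rep < 4; ++rep)     // a small strided loop as a stand-in workload
        for (uint64_t a = 0; a < 64 * kLineBytes; a += 8) access(a);

    std::cout << "single-way probes: " << single_way_probes
              << "  full probes: " << full_probes << "\n";
    return 0;
}
```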
We propose the Dynamic Associative Cache (DAC), a low-complexity design to improve the energy efficiency of data caches with negligible performance overhead. The key idea of DAC is to adapt cache associativity dynamically, switching the cache operation between direct-mapped and set-associative regimes during program execution. To monitor the program's needs in terms of cache associativity, the DAC design employs a subset of shadow tags: when the main cache operates in the set-associative mode, the shadow tags operate in the direct-mapped mode, and vice versa. The difference in hit rates between the main tags and the shadow tags is used as the indicator for cache mode switching. We show that DAC performs most of its accesses in the direct-mapped mode, resulting in significant energy savings while maintaining performance close to that of a set-associative L1 D-cache.
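The mode-switch decision described above might look roughly like the controller sketched below; the interval length, switching threshold and synthetic hit pattern are assumptions, not values from the paper.

```cpp
// Illustrative sketch (assumed interval/threshold values): a DAC-style
// controller compares main and shadow hit counters at the end of each
// interval and switches regime if the other mode's tags are doing better.
#include <cstdint>
#include <iostream>

enum class Mode { DirectMapped, SetAssociative };

class DacController {
public:
    void record(bool main_hit, bool shadow_hit) {
        main_hits_   += main_hit;
        shadow_hits_ += shadow_hit;
        if (++accesses_ == kInterval) decide();
    }
    Mode mode() const { return mode_; }

private:
    void decide() {
        // Switch only if the shadow (other-mode) tags out-hit the main tags
        // by more than the threshold, to avoid thrashing between modes.
        if (shadow_hits_ > main_hits_ + kThreshold) {
            mode_ = (mode_ == Mode::DirectMapped) ? Mode::SetAssociative
                                                  : Mode::DirectMapped;
            std::cout << "switching mode\n";
        }
        main_hits_ = shadow_hits_ = accesses_ = 0;
    }

    static constexpr uint32_t kInterval  = 100000;  // accesses per decision window (assumed)
    static constexpr uint32_t kThreshold = 512;     // hit-count margin before switching (assumed)
    Mode     mode_ = Mode::DirectMapped;            // start in the low-energy mode
    uint32_t main_hits_ = 0, shadow_hits_ = 0, accesses_ = 0;
};

int main() {
    DacController dac;
    // Feed a synthetic phase where the set-associative shadow tags hit more often.
    for (int i = 0; i < 100000; ++i) dac.record(/*main_hit=*/i % 2 == 0, /*shadow_hit=*/true);
    std::cout << (dac.mode() == Mode::SetAssociative ? "now set-associative\n"
                                                     : "still direct-mapped\n");
    return 0;
}
```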
2005
The memory subsystem has always been a performance bottleneck as well as a significant power contributor in memory-intensive applications. Many researchers have presented multilayered memory hierarchies as a means of designing energy- and performance-efficient systems. However, most previous work does not explore the trade-offs systematically. We fill this gap by proposing a formalized two-step technique: in the first step, our technique takes into account data reuse, the limited lifetimes of an application's arrays, and application-specific prefetching opportunities, and performs a thorough trade-off exploration for different data memory layer sizes. In the second step, we optimize the instruction memory and achieve further energy and performance gains. Following this approach, we have been able to reduce execution time by up to 60% and energy consumption by up to 70% for nine real-life applications.
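The first exploration step can be pictured with the toy sketch below, which sweeps candidate data-layer sizes through an invented cost model and keeps the Pareto-optimal (time, energy) points; none of the coefficients come from the paper.

```cpp
// Illustrative sketch (toy cost model, invented coefficients): exhaustive
// exploration of data-layer sizes, reporting only Pareto-optimal points.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct Point { uint32_t layer1_kb; double time; double energy; };

int main() {
    const double reuse_fraction_per_kb = 0.04;   // assumed: extra hit fraction per KB of layer
    const uint64_t accesses = 50'000'000;

    std::vector<Point> candidates;
    for (uint32_t kb = 1; kb <= 32; kb *= 2) {
        double hit    = std::min(0.95, reuse_fraction_per_kb * kb + 0.3);  // toy reuse curve
        double e_hit  = 0.2 + 0.02 * kb;         // bigger layer costs more energy per access
        double time   = accesses * (hit * 1.0 + (1 - hit) * 20.0);         // cycles
        double energy = accesses * (hit * e_hit + (1 - hit) * 10.0);       // arbitrary units
        candidates.push_back({kb, time, energy});
    }

    // Keep only Pareto-optimal points (no other point is better in both metrics).
    for (const Point& p : candidates) {
        bool dominated = false;
        for (const Point& q : candidates)
            if (q.time < p.time && q.energy < p.energy) { dominated = true; break; }
        if (!dominated)
            std::cout << p.layer1_kb << " KB layer: time " << p.time
                      << ", energy " << p.energy << "\n";
    }
    return 0;
}
```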
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design - ISLPED '12, 2012
To access a set-associative L1 cache in a high-performance processor, all ways of the selected set are searched and fetched in parallel using physical address bits. Such a cache is oblivious to the software semantics of memory references, such as the stack-heap bifurcation of the memory space and user-kernel ring levels. This wastes energy since, for example, a user-mode instruction fetch will never hit a cache block that contains kernel code. Similarly, a stack access will not hit a cache line that contains heap data. We propose to exploit software semantics in cache design to avoid unnecessary associative searches, thus reducing dynamic power consumption. Specifically, we utilize virtual memory region properties to optimize the data cache and ring-level information to optimize the instruction cache. Our design does not impact performance and incurs very small hardware cost. Simulation results using SPEC CPU and SPECjapps indicate that the proposed designs reduce cache block fetches from the DL1 and IL1 by 27% and 57% respectively, resulting in average savings of 15% of DL1 power and more than 30% of IL1 power compared to an aggressively clock-gated baseline.
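A simplified model of region-based way filtering is sketched below, reduced to a single stack/heap bit per cache way; sizes, the replacement policy and the synthetic workload are assumptions.

```cpp
// Illustrative sketch (one stack/heap bit, toy policy): skip ways whose
// resident line cannot match the reference's memory region, and count how
// many way probes that avoids versus a conventional parallel lookup.
#include <cstdint>
#include <iostream>
#include <vector>

constexpr int kWays = 8, kSets = 64, kLine = 64;
enum Region : uint8_t { kHeap = 0, kStack = 1 };

struct Way { uint64_t tag = 0; bool valid = false; Region region = kHeap; };

int main() {
    std::vector<std::vector<Way>> sets(kSets, std::vector<Way>(kWays));
    uint64_t probes_conventional = 0, probes_filtered = 0;
    int rr = 0;  // shared round-robin victim pointer (toy replacement policy)

    auto access = [&](uint64_t addr, Region r) {
        uint64_t line = addr / kLine;
        auto& set = sets[line % kSets];
        uint64_t tag = line / kSets;
        probes_conventional += kWays;                 // baseline: every way read in parallel
        bool hit = false;
        for (auto& w : set) {
            if (w.valid && w.region != r) continue;   // region mismatch: way not probed
            ++probes_filtered;                        // probe this way (tag + data)
            if (w.valid && w.tag == tag) hit = true;
        }
        if (!hit) {
            int victim = rr; rr = (rr + 1) % kWays;
            for (int i = 0; i < kWays; ++i) if (!set[i].valid) { victim = i; break; }
            set[victim].tag = tag; set[victim].valid = true; set[victim].region = r;
        }
    };

    // Synthetic workload: a heap stream mixed with a small, hot stack frame.
    for (int i = 0; i < 20000; ++i) {
        access(0x100000 + (i % 4096) * 8, kHeap);
        access(0x7fff0000 + (i % 16) * 8, kStack);
    }
    std::cout << "way probes, conventional: " << probes_conventional << "\n"
              << "way probes, region-filtered: " << probes_filtered << "\n";
    return 0;
}
```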
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2000
We introduce the Micro-Operation Cache (Uop Cache, UC), designed to reduce the processor's front-end power and energy consumption without performance degradation. The UC caches basic blocks of instructions pre-decoded into micro-operations (uops) and fetches a single basic block's worth of uops per cycle. Fetching complete pre-decoded basic blocks eliminates the need to repeatedly decode variable-length instructions and simplifies the process of predicting, fetching, rotating and aligning fetched instructions. The UC design enables even a small structure to be quite effective. Results: a moderate-sized UC eliminates about 75% of instruction decodes across a broad range of benchmarks, and over 90% in multimedia applications and high-power tests. For existing Intel P6 family processors, the eliminated work may save about 10% of full-chip power consumption with no performance degradation.
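The decode-elimination effect can be approximated with the toy front-end model below; the basic-block trace, uop-cache capacity and one-uop-per-instruction assumption are invented, not taken from the UC design.

```cpp
// Illustrative sketch (toy frontend model, invented parameters): a uop cache
// keyed by basic-block start address. A hit supplies pre-decoded uops for the
// whole block, so the legacy decoders are bypassed for those instructions.
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

struct BasicBlock { uint64_t start; int num_insts; };

int main() {
    // A tiny loop nest expressed as a repeating basic-block trace.
    std::vector<BasicBlock> blocks = {{0x400000, 6}, {0x400020, 3}, {0x400040, 8}};
    std::unordered_map<uint64_t, int> uop_cache;   // block start -> cached uop count
    const size_t uc_capacity_blocks = 64;          // assumed capacity, in basic blocks

    uint64_t decoded = 0, delivered_from_uc = 0;
    for (int iter = 0; iter < 10000; ++iter) {
        for (const auto& bb : blocks) {
            auto it = uop_cache.find(bb.start);
            if (it != uop_cache.end()) {
                delivered_from_uc += it->second;   // pre-decoded uops, decoders stay idle
            } else {
                decoded += bb.num_insts;           // legacy path: fetch + decode
                if (uop_cache.size() < uc_capacity_blocks)
                    uop_cache[bb.start] = bb.num_insts;  // assume one uop per instruction
            }
        }
    }
    uint64_t total = decoded + delivered_from_uc;
    std::cout << "decodes eliminated: " << 100.0 * delivered_from_uc / total << "%\n";
    return 0;
}
```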
Small filter caches (L0 caches) can be used to obtain significantly reduced energy consumption for embedded systems, but this benefit comes at the cost of increased execution time due to frequent L0 cache misses. The Instruction Register File (IRF) is an architectural extension for providing improved access to frequently occurring instructions. An optimizing compiler can exploit an IRF by packing an application's instructions, resulting in decreased code size, reduced energy consumption and improved execution time, primarily due to a smaller footprint in the instruction cache. The nature of the IRF also allows the execution of packed instructions to overlap with instruction fetch, thus providing a means for tolerating increased fetch latencies. This paper explores the use of an L0 cache enhanced with an IRF to provide even further reduced energy consumption with improved execution time. The results indicate that the IRF is an effective means for offsetting execution time ...
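A back-of-envelope sketch of IRF-style packing follows; the 32-entry IRF, the five-slots-per-packed-word limit and the synthetic trace are assumptions, and packing is estimated over a dynamic trace rather than performed statically by a compiler as in the paper.

```cpp
// Illustrative sketch (assumed 32-entry IRF, up to 5 slots per packed word):
// promote the most frequent instruction encodings into an instruction
// register file, then estimate how many fetches packing would remove.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <map>
#include <set>
#include <vector>

int main() {
    // Dynamic trace of (fake) instruction encodings; real input would come from profiling.
    std::vector<uint32_t> trace;
    for (int i = 0; i < 5000; ++i) {
        trace.push_back(0xA0 + i % 8);                   // a hot 8-instruction loop body
        if (i % 50 == 0) trace.push_back(0xF00 + i);     // colder, more varied code
    }

    // Promote the 32 most frequent encodings into the IRF.
    std::map<uint32_t, uint64_t> freq;
    for (uint32_t inst : trace) ++freq[inst];
    std::vector<std::pair<uint64_t, uint32_t>> ranked;
    for (auto& [inst, n] : freq) ranked.push_back({n, inst});
    std::sort(ranked.rbegin(), ranked.rend());
    std::set<uint32_t> irf;
    for (size_t i = 0; i < ranked.size() && i < 32; ++i) irf.insert(ranked[i].second);

    // Pack runs of IRF-resident instructions: up to 5 IRF indices per fetched word.
    uint64_t fetched_words = 0;
    for (size_t i = 0; i < trace.size();) {
        if (irf.count(trace[i])) {
            size_t run = 0;
            while (i < trace.size() && irf.count(trace[i]) && run < 5) { ++i; ++run; }
            ++fetched_words;              // the whole run is one packed instruction
        } else {
            ++i; ++fetched_words;         // ordinary instruction: one fetch each
        }
    }
    std::cout << "fetches: " << fetched_words << " instead of " << trace.size() << "\n";
    return 0;
}
```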
2006
In this paper we show that cache memories for embedded applications can be designed to both increase performance and reduce energy consumption. We show that using separate (data) caches for indexed or stream data and for scalar data items can lead to substantial improvements in terms of cache misses. The sizes of the various cache structures should be customized to meet the application's needs. We show that reconfigurable split data caches can be designed to meet wide-ranging embedded applications' performance, energy and silicon area budgets. The optimal cache organizations lead on average to 62% and 49% reductions in overall cache size, 37% and 21% reductions in cache access time, and 47% and 52% reductions in power consumption for the instruction and data cache respectively, when compared to an 8 KB instruction cache and an 8 KB unified data cache for media benchmarks from the MiBench suite.
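The pollution argument for split data caches can be seen in the toy comparison below, which routes compiler-classified stream and scalar accesses to separate direct-mapped caches of the same combined size as a unified cache; cache sizes and the synthetic access pattern are invented.

```cpp
// Illustrative sketch (toy direct-mapped models, invented sizes): compare a
// unified data cache with a split scalar/stream organization of equal total size.
#include <cstdint>
#include <iostream>
#include <vector>

struct DirectMappedCache {
    explicit DirectMappedCache(int lines) : tags(lines, ~0ull) {}
    bool access(uint64_t addr) {                  // returns true on hit
        uint64_t line = addr / 32;
        uint64_t& slot = tags[line % tags.size()];
        if (slot == line) return true;
        slot = line;                              // fill on miss
        return false;
    }
    std::vector<uint64_t> tags;
};

int main() {
    DirectMappedCache unified(256);                  // 8 KB unified (256 x 32 B lines)
    DirectMappedCache scalar_c(64), stream_c(192);   // split: 2 KB scalar + 6 KB stream

    uint64_t unified_misses = 0, split_misses = 0;
    for (int i = 0; i < 100000; ++i) {
        uint64_t stream_addr = 0x10000 + (i % 1536) * 4;  // 6 KB array swept repeatedly
        uint64_t scalar_addr = 0x500 + (i % 32) * 4;      // small, hot set of scalars
        unified_misses += !unified.access(stream_addr);
        unified_misses += !unified.access(scalar_addr);
        split_misses   += !stream_c.access(stream_addr);  // classified as stream data
        split_misses   += !scalar_c.access(scalar_addr);  // classified as scalar data
    }
    std::cout << "unified misses: " << unified_misses
              << "  split misses: " << split_misses << "\n";
    return 0;
}
```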
2005
Horizontally partitioned data caches are a popular architectural feature in which the processor maintains two or more data caches at the same level of the hierarchy. Horizontally partitioned caches help reduce cache pollution and thereby improve performance. Consequently, most previous research has focused on exploiting horizontally partitioned data caches to improve performance, achieving energy reduction only as a byproduct of performance improvement. In contrast, in this paper we show that optimizing for performance trades off several opportunities for energy reduction. Our experiments on an HP iPAQ h4300-like memory subsystem demonstrate that optimizing for energy consumption results in up to 127% less memory subsystem energy consumption than the performance-optimal solution. Furthermore, we show that the energy-optimal solution incurs on average only a 1.7% performance penalty. Therefore, with energy consumption becoming a first-class design constraint, there is a need for compilation techniques aimed at energy reduction. To achieve these energy savings we propose and explore several low-complexity algorithms aimed at reducing energy consumption, and show that very simple greedy heuristics achieve 97% of the possible memory subsystem energy savings.
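One plausible shape for such a greedy heuristic is sketched below; the per-access energy numbers, object sizes and mini-cache capacity are invented stand-ins for the paper's detailed memory-subsystem energy model.

```cpp
// Illustrative sketch (invented per-access energy numbers): a simple greedy
// heuristic assigning program objects to the mini cache or the main cache of
// a horizontally partitioned hierarchy to lower a modeled energy cost.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct Object {
    std::string name;
    uint32_t    size_bytes;
    uint64_t    accesses;     // profiled dynamic accesses
};

int main() {
    const uint32_t mini_capacity = 2 * 1024;      // 2 KB mini cache (assumed)
    const double   e_mini = 0.15, e_main = 1.0;   // relative energy per access (assumed)

    std::vector<Object> objs = {
        {"coeff_table", 512, 5'000'000},
        {"state", 256, 3'000'000},
        {"frame_buffer", 65536, 2'000'000},
        {"lut", 1024, 900'000},
        {"misc", 4096, 100'000},
    };

    // Greedy: take objects in order of accesses per byte while they fit.
    std::sort(objs.begin(), objs.end(), [](const Object& a, const Object& b) {
        return static_cast<double>(a.accesses) / a.size_bytes >
               static_cast<double>(b.accesses) / b.size_bytes;
    });

    uint32_t used = 0;
    double energy = 0.0, baseline = 0.0;          // per-access costs stand in for miss-rate models
    for (const Object& o : objs) {
        baseline += o.accesses * e_main;          // everything mapped to the main cache
        bool to_mini = used + o.size_bytes <= mini_capacity;
        if (to_mini) used += o.size_bytes;
        energy += o.accesses * (to_mini ? e_mini : e_main);
        std::cout << o.name << " -> " << (to_mini ? "mini" : "main") << " cache\n";
    }
    std::cout << "modeled energy: " << energy << " vs baseline " << baseline << "\n";
    return 0;
}
```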
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2000
The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. For example, the Level-1 (L1) instruction cache (I-Cache) of the StrongARM processor accounts for 27% of the power dissipation of the whole chip, whereas the instruction fetch unit (IFU) and the I-Cache of Intel's Pentium Pro processor are the single most important power-consuming modules, with 14% of the total power dissipation [2]. Extrapolating current trends, this portion is likely to increase in the near future, since the devices devoted to the caches occupy an increasingly large percentage of the total chip area. In this paper, we propose a technique that uses an additional mini cache, the L0-Cache, located between the I-Cache and the CPU core. This mechanism can provide the instruction stream to the data path and, when managed properly, can effectively eliminate the need for high utilization of the more expensive I-Cache. We propose, implement, and evaluate five techniques for dynamic analysis of program instruction access behavior, which is then used to proactively guide accesses to the L0-Cache. The basic idea is that only the most frequently executed portions of the code should be stored in the L0-Cache, since this is where the program spends most of its time. We present experimental results to evaluate the effectiveness of our scheme in terms of performance and energy dissipation for a series of SPEC95 benchmarks. We also discuss the performance and energy trade-offs involved in these dynamic schemes. Results for these benchmarks indicate that more than 60% of the energy dissipated in the I-Cache subsystem can be saved.
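A possible dynamic frequency filter for the L0-Cache is sketched below; it is not one of the paper's five schemes, and the L0 size, hot threshold and trace are assumptions.

```cpp
// Illustrative sketch (one possible dynamic filter, assumed parameters):
// blocks are fetched from a tiny L0 cache only once they have proven hot,
// so cold code keeps bypassing the L0 and the L1 I-Cache serves hot loops
// far less often.
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

int main() {
    const int l0_lines = 16;               // 16-line L0 (assumed)
    const int hot_threshold = 8;           // executions before a block counts as hot (assumed)

    std::unordered_map<uint64_t, int> exec_count;   // block address -> times seen
    std::unordered_map<uint64_t, bool> in_l0;       // blocks currently living in the L0
    uint64_t l0_fetches = 0, l1_fetches = 0;

    auto fetch = [&](uint64_t block) {
        if (in_l0.count(block)) { ++l0_fetches; return; }          // cheap L0 fetch
        ++l1_fetches;                                              // expensive I-Cache fetch
        if (++exec_count[block] >= hot_threshold &&
            static_cast<int>(in_l0.size()) < l0_lines)
            in_l0[block] = true;            // promote a proven-hot block into the L0
    };

    // Synthetic trace: a hot 4-block loop plus a long tail of cold blocks.
    std::vector<uint64_t> hot = {0x1000, 0x1040, 0x1080, 0x10c0};
    for (int i = 0; i < 50000; ++i) {
        for (uint64_t b : hot) fetch(b);
        fetch(0x200000 + (i % 10000) * 64);  // cold code: never reaches the threshold
    }
    std::cout << "L0 fetches: " << l0_fetches
              << "  L1 I-Cache fetches: " << l1_fetches << "\n";
    return 0;
}
```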