Academia.edu no longer supports Internet Explorer.
To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to upgrade your browser.
2017, Handbook of Hardware/Software Codesign
The task of architectural Design Space Exploration (DSE) is extremely complex, with multiple architectural parameters to be tuned and optimized, resulting in a huge design space that needs to be explored efficiently. Furthermore, each architectural parameter and/or design point is critically affected by decisions made at lower levels of abstraction (e.g., layout, choice of transistors, etc.). Ideally designers would like to perform DSE incorporating information and decisions made across multiple layers of design abstraction so that the ensuing design space is both feasible and has good fidelity. Simulation-based methods alone can not deal with this incredibly large and complex design space. To address these issues, this chapter presents an approach for cross-layer architectural DSE that efficiently prunes the large design space and furthermore uses predictive models to avoid expensive simulations. The chapter uses a singlechip heterogeneous single-ISA multiprocessor as an exemplar to demonstrate how the large search space can be covered and evaluated efficiently. A crosslayer approach is presented to cope with the complexity by restricting the search/design space through the use of cross-layer prediction models to avoid too costly full system simulations, coupled with systematic pruning of the design space to enable good coverage of the design space in an efficient manner.
ACM Transactions on Architecture and Code Optimization, 2008
Efficiently exploring exponential-size architectural design spaces with many interacting parameters remains an open problem: the sheer number of experiments required renders detailed simulation intractable. We attack this via an automated approach that builds accurate predictive models. We simulate sampled points, using results to teach our models the function describing relationships among design parameters. The models can be queried and are very fast, enabling efficient design tradeoff discovery. We validate our approach via two uniprocessor sensitivity studies, predicting IPC with only 1-2% error. In an experimental study using the approach, training on 1% of a 250-K-point CMP design space allows our models to predict performance with only 4-5% error. Our predictive modeling combines well with techniques that reduce the time taken by each simulation experiment, achieving net time savings of three-four orders of magnitude.
2014
Heterogeneous multicore processors (HMP) present significant advantages over homogenous multiprocessor due to their improved power, performance, and energy efficiency for a given chip/die area. However, to fully tap their potential, a systematic exploration of their diverse and vast design space is necessary. In this paper, we present a cross-layer (across application, operating system, and hardware architecture layer) design space exploration (DSE) of single-ISA (Instruction Set Architecture) heterogeneous multicore processors. We deploy predictive models to investigate the interactions and influence of heterogeneity (configurations, number and types of cores), multiobjective allocation strategies, along with diverse types of workloads under system level constraints (such as equal area or power budget). We perform a cross-layer DSE study of heterogeneous multicore processors with four simulated annealing (SA) based task allocation strategies that are more generic and efficient in c...
ACM SIGPLAN Notices, 2006
Architects use cycle-by-cycle simulation to evaluate design choices and understand tradeoffs and interactions among design parameters. Efficiently exploring exponential-size design spaces with many interacting parameters remains an open problem: the sheer number of experiments renders detailed simulation intractable. We attack this problem via an automated approach that builds accurate, confident predictive design-space models. We simulate sampled points, using the results to teach our models the function describing relationships among design parameters. The models produce highly accurate performance estimates for other points in the space, can be queried to predict performance impacts of architectural changes, and are very fast compared to simulation, enabling efficient discovery of tradeoffs among parameters in different regions. We validate our approach via sensitivity studies on memory hierarchy and CPU design spaces: our models generally predict IPC with only 1-2% error and reduce required simulation by two orders of magnitude. We also show the efficacy of our technique for exploring chip multiprocessor (CMP) design spaces: when trained on a 1% sample drawn from a CMP design space with 250K points and up to 55× performance swings among different system configurations, our models predict performance with only 4-5% error on average. Our approach combines with techniques to reduce time per simulation, achieving net time savings of three-four orders of magnitude.
2009 International Symposium on Systems, Architectures, Modeling, and Simulation, 2009
Multi-processor Systems-on-chip are currently designed by using platform-based synthesis techniques. In this approach, a wide range of platform parameters are tuned to find the best trade-offs in terms of the selected system figures of merit (such as energy, delay and area). This optimization phase is called Design Space Exploration (DSE) and it generally consists of a Multi-Objective Optimization (MOO) problem.
2012 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012
System-level design space exploration (DSE), which is performed early in the design process, is of eminent importance to the design of complex multi-processor embedded system architectures. During system-level DSE, system parameters like, e.g., the number and type of processors, the type and size of memories, or the mapping of application tasks to architectural resources, are considered. Simulation-based DSE, in which different design instances are evaluated using system-level simulations, typically are computationally costly. Even using high-level simulations and efficient exploration algorithms, the simulation time to evaluate design points forms a real bottleneck in such DSE. Therefore, the vast design space that needs to be searched requires effective design space pruning techniques. This paper presents a technique to reduce the number of simulations needed during system-level DSE. More specifically, we propose an iterative design space pruning methodology based on static throughput analysis of different application mappings. By interleaving these analytical throughput estimations with simulations, our hybrid approach can significantly reduce the number of simulations that are needed during the process of DSE. 1
Roedunet International Conference ( …, 2010
During the last years, especially due to the computing systems complexity growth, the need for tools which perform automatic design space exploration becomes more and more stringent. This paper presents a new initiated project having as the main aim developing a software tool, called FADSE (Framework for Automatic Design Space Exploration), that comes to meet this need. It is intended to provide out-ofthe-box algorithms capable of solving single and multiobjective optimization problems. It focuses on automatic design space exploration for multicore and manycore systems. This tool is intended to be flexible, to provide easy development and portability.
2008 Symposium on Application Specific Processors, 2008
Multi-Processor System on-Chip (MPSoC) architectures are currently designed by using a platform-based approach. In this approach, a wide range of platform parameters must be tuned to find the best trade-offs in terms of the selected figures of merit (such as energy, delay and area). This optimization phase is called Design Space Exploration (DSE) and it generally consists of a Multi-Objective Optimization (MOO) problem. The design space for an MPSoC architecture is too large to be evaluated comprehensively. So far, several heuristic techniques have been proposed to address the MOO problem for MPSoC, but they are characterized by low efficiency to identify the Pareto front. In this paper, an efficient DSE methodology is proposed leveraging traditional Design of Experiments (DoE) and Response Surface Modeling (RSM) techniques. In particular, the DoE phase generates an initial plan of experiments used to create a coarse view of the target design space; a set of RSM techniques are then used to refine the exploration. This process is iteratively repeated until the target criterion (e.g. number of simulations) is satisfied. A set of experimental results are reported to trade-off accuracy and efficiency of the proposed techniques with actual workloads l •
2007
The vast number of transistors available through modern fabrication technology gives architects an unprecedented amount of freedom in chip-multiprocessor (CMP) designs. However, such freedom translates into a design space that is impossible to fully, or even partially to any significant fraction, explore through detailed simulation. In this paper we propose to address this problem using predictive modeling, a well-known machine learning technique. More specifically we build models that, given only a minute fraction of the design space, are able to accurately predict the behavior of the remaining designs orders of magnitude faster than simulating them.
Proceedings of the 47th Design Automation Conference on - DAC '10, 2010
Given the increasing complexity of multi-processor systems-onchip, a wide range of parameters must be tuned to find the best trade-offs in terms of the selected system figures of merit (such as energy, delay and area). This optimization phase is called Design Space Exploration (DSE) consisting of a Multi-Objective Optimization (MOO) problem. In this paper, we propose an iterative design space exploration methodology exploiting the statistical properties of known system configurations to infer, by means of a correlation-based analysis, the next design points to be analyzed with low-level simulations. In fact, the knowledge of few design points is used to predict the expected improvement of unknown configurations. We show that the correlation of the configurations within the multi-processor design space can be modeled successfully with analytical functions and, thus, speed up the overall exploration phase. This makes the proposed methodology a model-assisted heuristic that, for the first time, exploits the correlation about architectural configurations to converge to the solution of the multi-objective problem 1 .
Journal of Systems Architecture, 2007
A reduction in the time-to-market has led to widespread use of pre-designed parametric architectural solutions known as system-on-a-chip (SoC) platforms. A system designer has to configure the platform in such a way as to optimize it for the execution of a specific application. Very frequently, however, the space of possible configurations that can be mapped onto a SoC platform is huge and the computational effort needed to evaluate a single system configuration can be very costly. In this paper we propose an approach which tackles the problem of design space exploration (DSE) in both of the fronts of the reduction of the number of system configurations to be simulated and the reduction of the time required to evaluate (i.e., simulate) a system configuration. More precisely, we propose the use of Multi-objective Evolutionary Algorithms as optimization technique and Fuzzy Systems for the estimation of the performance indexes to be optimized. The proposed approach is applied on a highly parameterized SoC platform based on a parameterized VLIW processor and a parameterized memory hierarchy for the optimization of performance and power dissipation. The approach is evaluated in terms of both accuracy and efficiency and compared with several established DSE approaches. The results obtained for a set of multimedia applications show an improvement in both accuracy and exploration time.
In this paper, we present a flexible and extensible system-level MP-SoC design space exploration (DSE) infrastructure, called NASA. This highly modular framework uses well-defined interfaces to easily integrate different system-level simulation tools as well as different combinations of search strategies in a simple plugand-play fashion. Moreover, NASA deploys a so-called dimension-oriented DSE approach, allowing designers to configure the appropriate number of, well-tuned and possibly different, search algorithms to simultaneously co-explore the various design space dimensions. As a result, NASA provides a flexible and re-usable framework for the systematic exploration of the multi-dimensional MP-SoC design space, starting from a set of relatively simple user specifications. To demonstrate the capabilities of the NASA framework and to illustrate its distinct aspects, we also present several DSE experiments in which we, e.g., compare NASA configurations using a single search algorithm for all design space dimensions to configurations using a separate search algorithm per dimension. These proof-of-concept experiments indicate that the latter multi-dimensional coexploration can find better design points and evaluates a higher diversity of design alternatives as compared to the more traditional approach of using a single search algorithm for all dimensions.
Eurasip Journal on Embedded Systems, 2006
Fully integrated system level design space exploration methodologies are essential to guarantee efficiency of future large scale system on programmable chip. Each design step in the design flow from system architecture to place and route represents an optimization problem. So far, different tools (computer architecture, design automation) are used to address each problem separately with at best estimation techniques from one level to another. This approach ignores the various and very diverse vertical relations between distinct levels parameters and provides at best local optimization solutions at each step. Due to the large scale of SoC, system level design methodologies need to tackle the system design process as a global optimization problem by fully integrating physical design in the design space exploration. We propose MOCDEX, a multiobjective design space exploration methodology, for multiprocessor on chip which closes the gap between these associated tools in a fully integrated approach and with hardware in the loop. A case study of a 4-way multiprocessor demonstrates the validity of our approach.
Given the increasing complexity of Chip Multi-Processors (CMPs), a wide range of architecture parameters must be explored at design time to find the best trade-off in terms of multiple competing objectives (such as energy, delay, bandwidth, area, etc.) The design space of the target architecture is huge because of it should consider all possible combinations of each parameter (number of processors, processor issue width, L1 and L2 cache sizes, etc.). In this complex scenario, the multi-objective exploration of the huge design space of next generation CMPs cannot be anymore based on intuition and past experience of the design architects. An Automatic Design Space Exploration methodology is needed to support systematically the exploration and the quantitative comparison of the design alternatives in terms of multiple competing objectives (Pareto analysis). An overall design space exploration framework is needed to combine all optimizations into a global search space with a common interface to the simulation and evaluation tools. Our work 1 focuses on the definition of an automatic multi-objective Design Space Exploration (DSE) framework for tuning Chip Multi-Processor architectures by evaluating a set of metrics (such as energy and delay) for the next generation embedded computing platforms. Multicube Explorer is an interactive open-source framework to enable the designer to automatically explore a design space of configurations for a parameterized architecture for which an executable model (use case simulator) exists. Multicube Explorer is an advanced multi-objective optimization framework which is entirely command-line/script driven and can be re-targeted to any configurable platform by writing a suitable XML design space definition file and providing a configurable simulator.
Lecture Notes in Computer Science, 2009
This paper argues the case for the use of analytical models in FPGA architecture layout exploration. We show that the problem when simplified, is amenable to formal optimization techniques such as integer linear programming. However, the simplification process may lead to inaccurate models. To test the overall methodology, we combine the resulting layouts with VPR 5.0. Our results show that the resulting architectures are better than those found using traditional parameter sweeping techniques.
Journal of Systems Architecture, 1999
As microprocessor-based systems grow in complexity, and the processor-memory speed gap widens further, more emphasis needs to be placed on early design space exploration in order to produce the highest performance systems with minimal schedule impact. We discuss the critical issues associated with architectural evaluation of complex microprocessor-based systems, and present a methodology for the comprehensive and semiautomatic evaluation of processor, cache hierarchy, system interconnect, and main memory architectural and technological alternatives. We discuss the implementation of the methodology, and describe how it can be used in early design space exploration. The unique aspects of the methodology are further illustrated through two architectural investigations performed using the toolset.
2012
Networks-on-Chip (NoCs) are increasingly used in many-core architectures. ORION2.0 [18] is a widely adopted NoC power and area estimation tool that is based on circuit-level templates, which use specific logic structures to model implementation of different router components. ORION2.0 estimation models can have large errors (up to 185%) versus actual implementation, often due to a mismatch between the actual router RTL and the templates assumed, as well as the effects of optimization tools in modern design flows. In this work, we propose comprehensive parametric and non-parametric modeling methodologies that fundamentally differ from logic template based approaches in that the estimation models are derived from actual physical implementation data. Specifically, we propose a new parametric modeling methodology as well as improvements to previous work on non-parametric modeling. Our work on parametric modeling proposes (1) ORION NEW models, developed using a new methodology that does ...
2003
The evaluation of the best system-level architecture in terms of energy and performance is of mainly importance for a broad range of embedded SOC platforms. In this paper, we address the problem of the efficient exploration of the architectural design space for parameterized microprocessor-based systems. The architectural design space is multi-objective, so our aim is to find all the Pareto-optimal configurations representing the best power-performance design trade-offs by varying the architectural parameters of the target system. In particular, the paper presents a Design Space Exploration (DSE) framework tuned to efficiently derive Pareto-optimal curves. The main characteristics of the proposed framework consist of its flexibility and modularity, mainly in terms of target architecture, related system-level executable models, exploration algorithms and system-level metrics. The analysis of the proposed framework has been carried out for a parameterized superscalar architecture executing a selected set of benchmarks. The reported results have shown a reduction of the simulation time of up to three orders of magnitude with respect to the full search strategy, while maintaining a good level of accuracy (under 4% on average).
Proceedings of the 41st annual conference on Design automation - DAC '04, 2004
Architectural simulation has achieved a prominent role in the system design cycle by providing designers the ability to quickly examine a wide variety of design choices. However, the recent trend in system design toward architectures that react to circuit-level phenomena has outstripped the capabilities of traditional cycle-based architectural simulators. In this paper, we present an architectural simulator design that incorporates a circuit modeling capability, permitting architectural-level simulations that react to circuit characteristics (such as latency, energy, or current draw) on a cycle-by-cycle basis. While these additional capabilities slow simulation speed, we show that the careful application of circuit simulation optimizations and simulation sampling techniques permit high levels of detail with sufficient speed to examine entire workloads.
IEEE Transactions on Computers, 2000
Continuous advances in silicon technology enable the development of complex System-on-Chip as cooperation among Digital Signal Processors (DPSs), General Purpose Processors (GPPs), and specific hardware components. The impact of this choice is not only limited to the target architecture, but also encompasses the overall system specification. It is thus crucial to manage such a complexity using high-level specification languages and a tool chain supporting the designer throughout a set of strategic decisions, such as the identification of a set of possible target architectures, the verification of the correctness of the specification, and the partitioning of the specification onto a set of computational resources. This paper addresses this type of problem by proposing a design flow supporting the system-level design of heterogeneous multiprocessor system-on-chip (MP-SoC), by extracting information from the system description (e.g., SystemC)-statically and in a fast manner-and by providing a set of quantitative measures correlating the type of executor, the functionality, and a timing estimation. Partitioning and architecture selection are built on top of this data and the final analysis of the selected Hardware-Software solution over the identified candidates is finally submitted to a timing verification via simulation. Note that the possibility of actually performing a comprehensive design space exploration, in general, is tightly influenced by the interaction between partitioning/architecture-selection and timing simulation in the design flow; for this reason, the description of this aspect is particularly emphasized in the presentation of the methodology. To show the applicability of the proposed methodology, two relevant case studies are described in the paper.
Proceedings of the 15th international symposium on System Synthesis - ISSS '02, 2002
Application specific systems have potential for customization of design with a view to achieve a better costperformance-power trade-off. Such customization requires extensive design space exploration. In this paper, we introduce a performance evaluation methodology for system-level design exploration that is much faster than traditional cycleaccurate simulation. The trade off is between accuracy and simulation speed. The methodology is based on probabilistic modeling of system components customized with application behavior. Performance numbers are generated by simulating these models. We have implemented our models using SystemC and validated these for uni-processor as well as multiprocessor systems against various benchmarks.
Loading Preview
Sorry, preview is currently unavailable. You can download the paper by clicking the button above.