2017, International Journal of Engineering Research and
In this paper we illustrate the retiming technique for reducing the iteration period and minimizing the number of registers. Retiming shortens computation time and so raises processor speed, which makes it useful for optimizing performance in real-time implementations. Shortest-path algorithms such as Bellman-Ford and Floyd-Warshall are used in retiming. The retiming techniques are explained quantitatively, followed by the results. Retiming helps reduce switching time. Minimizing the number of registers through retiming reduces both the memory requirement and the area. Power consumption is reduced in turn, because of the smaller area and shorter switching time, as given by the dynamic power equation.
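The abstract does not spell out how the two shortest-path algorithms enter retiming. A minimal sketch of one common formulation follows, under stated assumptions: Floyd-Warshall builds the tables W(u, v) (minimum registers on any path) and D(u, v) (maximum delay among those paths), and Bellman-Ford then solves the resulting difference constraints r(u) - r(v) <= k. The four-node graph, the function names, and the target period are hypothetical and are not taken from the paper.

```python
from itertools import product

INF = float("inf")

def clock_period(delays, edges):
    """Longest zero-register path delay (assumes no zero-register cycle)."""
    arrive = dict(delays)
    zero_edges = [(u, v) for u, v, w in edges if w == 0]
    for _ in range(len(delays)):
        for u, v in zero_edges:
            arrive[v] = max(arrive[v], arrive[u] + delays[v])
    return max(arrive.values())

def path_tables(delays, edges):
    """Floyd-Warshall over (registers, -delay) pairs, compared lexicographically."""
    nodes = list(delays)
    dist = {(u, v): ((0, 0) if u == v else (INF, 0))
            for u, v in product(nodes, nodes)}
    for u, v, w in edges:
        dist[u, v] = min(dist[u, v], (w, -delays[u]))
    for k, i, j in product(nodes, nodes, nodes):  # k is the outermost loop
        if dist[i, k][0] == INF or dist[k, j][0] == INF:
            continue
        cand = (dist[i, k][0] + dist[k, j][0], dist[i, k][1] + dist[k, j][1])
        if cand < dist[i, j]:
            dist[i, j] = cand
    return dist

def retime(delays, edges, c):
    """Return retiming values r(v) that meet clock period c, or None."""
    dist = path_tables(delays, edges)
    # A constraint r(u) - r(v) <= k becomes a constraint-graph edge v -> u.
    cons = [(v, u, w) for u, v, w in edges]       # register counts stay >= 0
    for (u, v), (regs, neg_delay) in dist.items():
        if regs != INF and delays[v] - neg_delay > c:   # D(u, v) > c
            cons.append((v, u, regs - 1))         # path needs at least one register
    r = {v: 0 for v in delays}                    # virtual source at distance 0
    for _ in range(len(delays)):                  # Bellman-Ford relaxation passes
        changed = False
        for v, u, k in cons:
            if r[v] + k < r[u]:
                r[u], changed = r[v] + k, True
        if not changed:
            return r
    return None                                   # negative cycle: c is infeasible

def apply_retiming(edges, r):
    """Retimed register count on each edge u -> v: w + r(v) - r(u)."""
    return [(u, v, w + r[v] - r[u]) for u, v, w in edges]

# Hypothetical graph: node -> delay, edges as (source, target, registers).
delays = {1: 1, 2: 1, 3: 2, 4: 2}
edges = [(1, 2, 1), (2, 3, 1), (2, 4, 2), (3, 1, 0), (4, 1, 0)]
r = retime(delays, edges, c=2)
retimed = apply_retiming(edges, r)
print(clock_period(delays, edges), "->", clock_period(delays, retimed))
```

Asking for an infeasible period, for example `retime(delays, edges, c=1)` when some node delays are 2, returns `None` because the constraint graph then contains a negative cycle.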
Journal of Emerging Technologies and Innovative Research, 2019
The main components of a computation model are energy, space, and time. Minimizing them is a challenge in building an efficient processing design. This work overcomes the problems of caching by utilizing reversible logic circuits (RLC). Further, high communication overhead between cache and main memory degrades performance and generates high heat in VLSI components such as GPUs and CPUs. Thus, limiting memory resource usage, or using it efficiently, is highly desirable. To use RLC, it is important to reduce the delay and wire length of the VLSI circuit. This work presents an information processing unit (IPU) built from logically reversible circuits using adiabatic reversible Toffoli gates. To minimize path delay due to the presence of faults in the circuit design, the Elmore model is used at both the switch level and the wire level, and a linear model is used at the gates. Further, to trade off memory minimization against the delay (processing time) requirement, a delay-aware rectilinear Steiner minimum tree (RSMT) is created....
algorithms to optimize power
To reduce power consumption, optimizations are required at various levels of the design flow, such as the algorithm, architecture, logic, and circuit and process levels. This paper considers two logic-level approaches to low-power digital design. Optimization techniques are applied to reduce the switching-activity power of individual logic gates. Power can be reduced by either circuit-level or logic-level optimization. In this paper, circuit-level optimization is used to reduce area and power. In the first approach, modified gate diffusion input (GDI) logic is used in the proposed parallel asynchronous self-timed adder (PASTA) technique. Similarly, the structures of the XOR gate and half adder are reduced to achieve low area and low power. In the second approach, a multi-valued logic (MVL) based digital circuit is designed by extending the representation domain from two-level (N = 2) switching algebra to N > 2 levels. The main advantage of this approach is that it compensates for the inefficiency of existing integrated circuits used to implement the universal set of MVL gates. The results show that the proposed GDI-logic-based adder needs fewer transistors (less area) and consumes less power than the existing technique. The proposed MVL technique allows the design of MVL digital circuits that obtain their values from binary circuits. This technique also offers low power and small wiring delay compared with binary and three-valued logic. Simulation is carried out with Tanner tool v14.11 to check the functionality of the PASTA and MVL circuits.
A. Proposed Modified GDI Logic
In day-to-day life, System on Chip (SoC) products are necessary. Millions of chips integrated into one single chip is called an SoC. These millions of chips are integrated into a single chip by shrinking the transistor size in each and every chip. Therefore this CMOS technique can be applied in SoC products [3].
The Carry Select Adder (CSLA) is primarily used to minimize chip size and reduce propagation delay. The parallel asynchronous self-timed adder (PASTA) works on the basis of iterative coding, so unwanted clock-cycle activations are removed in this adder to achieve high speed and low power. This type of adder is designed in this paper in two ways [10]. The Gate Diffusion Input (GDI) technique was proposed in 2002 to reduce the area and power of VLSI digital circuits. GDI logic was initially proposed for fabrication in twin-well and Silicon on Insulator (SOI) CMOS processes. It enabled the implementation of a broad range of complex logic functions using only two transistors. This scheme was appropriate for the design of regular digital circuits, with a much lower area than existing PTL and static CMOS methods, while offering improved power characteristics. Like PTL implementations, GDI circuits suffered from a reduced swing because of threshold drops. Nevertheless, the considerably reduced transistor count and the logic flexibility of the basic GDI cell give a major power reduction, in spite of the need for swing restoration circuits [1].
B. Proposed MVL Logic
MVL, also known as multiple-valued, multi-valued, or many-valued logic, traces its origins back to Lukasiewicz logic and Post algebra. The methodology proposed in this work is based on a universal set of gates that is used to implement operators acting on the elements of a domain. The current trend in integrated circuits (ICs) is to embed multiple systems onto a single IC, known as a System on a Chip (SoC), which leads to an increase in the number, delay, length, and complexity of the interconnections. Multiple-valued logic is a viable alternative for coping with these interconnection issues, as it is said to decrease the number of interconnections.
This reduction in the IC area devoted to interconnections has motivated many MVL proposals. Methodologies for the synthesis of MVL digital circuits comprise the operators and their properties. The main drawbacks of such methodologies are, first, the lack of existing integrated circuits that implement the universal set of gates and, second, the lack of the minimization tools needed to design practical MVL digital circuits.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2002
This paper investigates the application of simultaneous retiming and clock scheduling for optimizing synchronous circuits under setup and hold constraints. Two optimization problems are explored: (1) clock period minimization and (2) tolerance maximization to clock-signal delay variations. Exact mixed-integer linear programming formulations and efficient heuristics are given for both problems. When both long and short paths are considered, circuits optimized by the combined application of retiming and clock scheduling can achieve shorter clock periods or demonstrate greater tolerance to clock-signal delay variations than circuits optimized by retiming or clock scheduling. Experiments with benchmark circuits demonstrate the effectiveness of the combined optimization. In comparison with the best result obtained by either of the two optimizations, the joint application of retiming and clock scheduling increased operating speeds by more than 8% on the average. It also increased tolerance to clock delay variations by an average of 12% over a broad range of target clock frequencies. Larger relative improvements were achieved for shorter clock periods, thus suggesting that simultaneous retiming and clock scheduling can play an important role in high-speed design.
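For orientation, setup and hold constraints in clock scheduling are commonly written in the following form. This is a sketch with assumed symbols, not the paper's own formulation: $s_i$ is the clock arrival time at register $i$, $T$ is the clock period, $d^{\max}_{ij}$ and $d^{\min}_{ij}$ are the longest and shortest combinational delays from register $i$ to register $j$, and $t_{\text{setup}}$, $t_{\text{hold}}$ are the register setup and hold times.

```latex
\begin{align}
  s_i + d^{\max}_{ij} + t_{\text{setup}} &\le s_j + T
    && \text{(setup, long path)} \\
  s_i + d^{\min}_{ij} &\ge s_j + t_{\text{hold}}
    && \text{(hold, short path)}
\end{align}
```

Clock scheduling chooses the $s_i$, while retiming moves registers and thereby changes $d^{\max}_{ij}$ and $d^{\min}_{ij}$, which is why the two can be optimized together.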
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2013
Power has become a burning issue in modern VLSI design. In modern integrated circuits, the power consumed by clocking gradually takes a dominant part. Given a design, we can reduce its power consumption by replacing some flip-flops with fewer multi-bit flip-flops. However, this procedure may affect the performance of the original circuit. Hence, replacing flip-flops without violating timing and placement capacity constraints becomes quite a complex problem. To deal with the difficulty efficiently, we have proposed several techniques. First, we perform a coordinate transformation to identify the flip-flops that can be merged and their legal regions. Second, we show how to build a combination table to enumerate the possible combinations of flip-flops provided by a library. Finally, we merge flip-flops hierarchically. Besides power reduction, the objective of minimizing the total wirelength is also considered. The time complexity of our algorithm is $\Theta(n^{1.12})$, less than the empirical complexity of $\Theta(n^{2})$. According to the experimental results, our algorithm significantly reduces clock power by 20-30% and the running time is very short. In the largest test case, which contains 1 700 000 flip-flops, our algorithm takes only about 5 min to replace flip-flops, and the power reduction reaches 21%.
In VLSI circuits, space, power consumption, and speed are all significant design considerations. The design components have conflicting effects on the overall performance of circuits. Power dissipation can be optimised by making compromises among the various components. In VLSI circuits (such as multipliers), power consumption is also data dependent. The goal of this study is to compare different design techniques and suggest a modular strategy for reducing power usage. It has been found that algorithm-based design reduces gate switching activity and, as a result, reduces multiplier power consumption. With the partially guarded methodology, power consumption is decreased by 10-44 percent with 30-36 percent less area overhead, while with the temporal tiling method, array multiplier delay and power dissipation are observed to rise by 50 percent and 30 percent, respectively. The Booth-recoded Wallace tree multiplier is found to be 67% faster than the Wallace tree multiplier, 53% faster than the Vedic multiplier, and 22% faster than the radix-8 Booth multiplier. We investigate several optimization approaches for the Wallace multiplier, bypassing multiplier, modified Booth multiplier, and Vedic multiplier. Arithmetic operations, particularly multiplication, take a significant amount of processing time in a conventional processor's central processing unit. Multiplication is a fundamental mathematical operation that needs significantly more hardware and processing time than addition and subtraction.
ACM Transactions on Design Automation of Electronic Systems, 1999
We present an efficient technique to reduce the switching activity in a technology-mapped CMOS combinational circuit based on local logic transformations. The transformations consist of adding redundant connections or gates so as to reduce switching activity. We describe simple and efficient procedures, based on logic implication, for identifying the sources and targets of the redundant connections. Additionally, we give procedures that permit the designer to trade-off power and delay after the transformations. Results of experiments on both the MCNC benchmark circuits and the circuits of a PowerPC microprocessor chip are given. The results indicate that significant power reduction of a CMOS combinational circuit can be achieved with very low area overhead, delay penalty, and computational cost.
Low-power devices play a dominant role in present-day VLSI design technology. If power consumption is lower, power dissipation is also lower. The power dissipation of a device can be reduced by using different low-power techniques. In this paper, the performance of a 4x1 multiplexer implemented in different low-power techniques is analyzed, and its power dissipation in those techniques is compared with that of the conventional CMOS design. Each of these techniques has different advantages depending on its logic of operation. The simulation results show that the proposed techniques dissipate less power than conventional CMOS, with a reduction in area as well.
2018
This research paper deals with the design and implementation of low-power 8-bit arithmetic logic units. In any processor, the main part of the power is consumed in the ALU. Therefore, reducing power dissipation in the ALU is required. The proposed technique uses tri-state logic to disable one of the main blocks of the ALU whenever it is not needed by the required operation. In this work, the suggested design is realized using ASIC methodologies. To implement the arithmetic and logic architectures, 130 nm standard cell libraries are used for the ASIC implementation. The architecture of the design is described in Verilog HDL and simulated using the ModelSim-Altera 10.3c (Quartus II 14.1) tools. With the tri-state technique, both dynamic power and total power are decreased.
1994
Reducing switching activity would significantly reduce power consumption of a processor chip. In this paper, we present two novel techniques, Gray code addressing and Cold scheduling, for reducing switching activity on high performance processors.
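To illustrate why Gray code addressing can lower switching activity, the sketch below counts bit toggles on an address bus for a sequential sweep, once in plain binary and once in reflected Gray code. The 4-bit sweep and the helper names `to_gray` and `bus_toggles` are illustrative assumptions and are not taken from the paper.

```python
def to_gray(n: int) -> int:
    """Convert a binary number to its reflected Gray code."""
    return n ^ (n >> 1)

def bus_toggles(words):
    """Total number of bit flips between consecutive words on a bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

# Hypothetical access pattern: a sequential sweep over 16 addresses.
addresses = list(range(16))
binary_toggles = bus_toggles(addresses)
gray_toggles = bus_toggles([to_gray(a) for a in addresses])
print(binary_toggles, gray_toggles)  # prints "26 15"
```

Consecutive Gray codes differ in exactly one bit, so a sequential sweep causes one toggle per step, whereas binary counting flips several bits whenever a carry ripples.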
1993
In this paper we address the problem of optimization of VLSI circuits to minimize power consumption while meeting performance goals. We present a method of estimating power consumption of a basic or complex CMOS gate which takes the internal capacitances of the gate into account. This method is used to select an ordering of series-connected transistors found in CMOS gates to achieve lower power consumption. The method is very efficient when used by library based design styles. We describe a multi-pass algorithm which makes use of transistor reordering to optimize performance and power consumption of circuits, which has a linear time complexity per pass and which converges to a solution in a small number of passes. Transformations besides transistor reordering can be used by the algorithm. The algorithm has been benchmarked on several large examples and the results are presented.
In this paper we present the clock gating process for low-power VLSI (very large scale integration) circuit design. Clock gating is one of the most frequently used techniques at RTL for reducing dynamic power consumption without affecting the performance of the design. One approach involves inserting gating conditions in the RTL, which the synthesis tool translates into clock gating cells in the clock path of a register bank. This reduces the switching activity on the clock network, thereby decreasing dynamic power consumption within the design. Because the translation performed by the synthesis tool is purely combinational, it is referred to as combinational clock gating. This transformation does not alter the behavior of the register being gated.
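The link between switching activity and dynamic power is usually expressed by the standard first-order CMOS relation below. The symbols are assumptions introduced here, since the abstract gives no formula: $\alpha$ is the switching activity factor, $C_L$ the switched load capacitance, $V_{DD}$ the supply voltage, and $f_{\text{clk}}$ the clock frequency.

```latex
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f_{\text{clk}}
```

Under this model, gating the clock of an idle register bank lowers the effective $\alpha$ of that part of the clock network, which is where the power saving comes from.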
Proceedings of the 13th ACM Great Lakes Symposium on VLSI - GLSVLSI '03, 2003
We address the problem of minimizing dynamic power consumption for single-phase synchronous digital designs, under timing constraints, using a unification of basic retiming and supply voltage scaling. We assume that the number of supply voltages and their values are known for each computation element. Our main objective is then to change the location of registers using basic retiming while maximizing the number of computation elements off critical paths that can operate under a low available supply voltage, which can lead to a maximum dynamic power saving. We address the problem at the system level. We formulate the problem as a Mixed Integer Linear Program (MILP), so the exact optimal solution for the problem is guaranteed. We also devise an algorithm to compute bounds on the values assigned by basic retiming to each computational element. Besides helping to find the optimal solution to the problem, these bounds also reduce the run-time for finding this solution. The proposed approach can produce converter-free designs and can also minimize short-circuit power consumption. Experimental results have shown that dynamic power consumption can be reduced by 2.78% to 37.24% for single-phase designs with minimal clock period. For these experimental results, the run-time for solving the MILP is under 2 min.
ISCAS '98. Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (Cat. No.98CH36187), 1998
In this paper a novel approach for low power realization of DSP algorithms that are based on inner product computation is proposed. Inner product computation between data and coefficients is a very common computational structure in DSP algorithms. The proposed methodology is based on an architectural transformation that reorders the sequence of evaluation of the partial products forming the inner products. The total Hamming distance of the sequence of coefficients, which are known before realization, is used as the cost function driving the reordering. The reordering of computation reduces the switching activity at the inputs of the computational units. Experimental results show that the proposed methodology leads to significant savings in switching activity and thus in power consumption
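The reordering idea can be sketched with a simple greedy nearest-neighbour heuristic that uses the total Hamming distance of the coefficient sequence as its cost. This is an illustrative stand-in, not the paper's algorithm; the 4-bit coefficients, the data values, and the function names are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two non-negative integers."""
    return bin(a ^ b).count("1")

def total_distance(seq):
    """Total Hamming distance between consecutive words of a sequence."""
    return sum(hamming(a, b) for a, b in zip(seq, seq[1:]))

def greedy_order(coeffs):
    """Visit order (indices): start at index 0, then always pick the nearest word."""
    if not coeffs:
        return []
    order, remaining = [0], list(range(1, len(coeffs)))
    while remaining:
        last = coeffs[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(last, coeffs[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Hypothetical coefficients and data samples.
coeffs = [0b0000, 0b1111, 0b0001, 0b1110]
data = [3, 5, 7, 11]
order = greedy_order(coeffs)
reordered = [coeffs[i] for i in order]
print(total_distance(coeffs), "->", total_distance(reordered))  # prints "11 -> 5"

# Addition is commutative, so evaluating the products in the new order
# leaves the inner product unchanged as long as data follows the same order.
original_sum = sum(c * x for c, x in zip(coeffs, data))
reordered_sum = sum(coeffs[i] * data[i] for i in order)
```

A greedy pass is not guaranteed to find the minimum-cost ordering; it only shows how a Hamming-distance cost can drive the evaluation order.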
arXiv (Cornell University), 2024
Industrial datapath designers consider dynamic power consumption to be a key metric. Arithmetic circuits contribute a major component of total chip power consumption and are therefore a common target for power optimization. While arithmetic circuit area and dynamic power consumption are often correlated, there is also a tradeoff to consider, as additional gates can be added to explicitly reduce arithmetic circuit activity and hence reduce power consumption. In this work, we consider two forms of power optimization and their interaction: circuit area reduction via arithmetic optimization, and the elimination of redundant computations using both data and clock gating. By encoding both these classes of optimization as local rewrites of expressions, our tool flow can simultaneously explore them, uncovering new opportunities for power saving through arithmetic rewrites using the e-graph data structure. Since power consumption is highly dependent upon the workload performed by the circuit, our tool flow facilitates a data dependent design paradigm, where an implementation is automatically tailored to particular contexts of data activity. We develop an automated RTL to RTL optimization framework, ROVER, that takes circuit input stimuli and generates power-efficient architectures. We evaluate the effectiveness on both open-source arithmetic benchmarks and benchmarks derived from Intel production examples. The tool is able to reduce the total power consumption by up to 33.9%.
IOSR Journal of Electrical and Electronics Engineering, 2014
Space, power consumption, and speed are major design issues in VLSI circuits. The design components have conflicting effects on the overall performance of circuits. Power dissipation can be optimized by making compromises among the various components. Power consumption in VLSI circuits (such as multipliers) is also data dependent. In this paper, an attempt is made to test different design methods and to propose a modular approach for optimizing power consumption. It is found that algorithm-based design reduces gate switching activity considerably and, as a result, power consumption in the multiplier is reduced.
ACM SIGARCH Computer Architecture News, 2011
This paper examines the effectiveness of employing precomputation techniques to reduce the power consumption of field configurable computing systems. Multipliers are modified with precomputation techniques and implemented using commercial off-the-shelf FPGAs. Precomputation techniques reduce the dynamic power consumption of a module by eliminating unnecessary signal switching activity in inactive portions of the module. Experiments have shown that up to 52% of logic and signal power consumption can be saved in the multiplier module. Furthermore, when compared to ASIC implementations, FPGA implementations of precomputation modules have the advantage of lower area overhead, as most of them can be implemented using originally unoccupied related FPGA resources. Finally, it was found that the effectiveness of precomputation depends heavily on the input data statistics. It is expected that compilers for future reconfigurable computers may take full advantage of such power saving techniques by optimizing the architecture according to data input statistics.
2012
The continued scaling of CMOS technology has led us into the deep submicron regime, where design is no longer limited by the functionality on a chip but is constrained by its power consumption. In this paper, we present some widely used techniques for static and dynamic power minimisation in modern VLSI circuits. These techniques are applicable at the different stages of system design, starting from the technology level, where the designer is allowed to change technology parameters (transistor sizes, supply and threshold voltages), up to the top level, which deals with the design's architectural variations. Along with the overview of power minimisation techniques, as an example, a binary divider circuit is introduced and implemented in various FPGA families to demonstrate the influence of technology as well as Placement and Routing (PAR) on total power consumption.
2006 International Conference on Communications, Circuits and Systems, 2006
This paper presents a partition-based retiming approach to reduce dynamic power in CMOS circuits. More precisely, the algorithm first partitions the original circuit into subcircuits, effectively reducing the computational complexity. It then applies the retiming technique among these subcircuits, while precomputing those subcircuits that are large enough and have a single output. We evaluate the low-power technique on ten MCNC benchmarks; the average power reduction reaches 43%, 4% higher than previous methods.
International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2019
A number of computations are involved at every stage in Digital Signal Processing (DSP). At every stage of computation we have addition and multiplication of the terms derived from the previous and present stages. The general computation uses ordinary multiplication and addition, but the circuitry for these is inefficient: it occupies more space on chip, consumes more power, and computes slowly. These drawbacks can be avoided by switching to the proposed method, called Multiplication and Accumulation (MAC). The aim of this project is to develop an area-optimized, low-power digital circuit for the MAC (Multiply and Accumulate) operation. We develop Verilog Hardware Description Language code for various implementations of the MAC; that is, we try to avoid using multipliers and prefer combinational circuits such as multiplexers. These Verilog HDL codes are simulated to check their functionality. Once we get the expected results, we proceed to the implementation of the digital circuits. We analyze all the MAC digital circuits to find the one that consumes minimum area and power. The importance of MAC in FPGA designs is explained through some filter designs. We also give some suggestions on system-level solutions based on the MAC.