2019, TELKOMNIKA Telecommunication Computing Electronics and Control
https://doi.org/10.12928/telkomnika.v17i3.12409…
This paper investigates the possibility of reducing power consumption in neural networks using approximate computing techniques. The authors compare a traditional fixed-point neuron with an approximate neuron composed of approximate multipliers and adders. Experiments show that in the proposed case study (a wine classifier) the approximate neuron saves up to 43% of the area and 35% of the power consumption, and improves the maximum clock frequency by 20%.
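A common way to build an approximate multiplier of the kind this abstract describes is to truncate the least significant partial-product columns, trading a small product error for fewer adder cells (and hence area and power). The sketch below is illustrative only; the paper's actual approximate multiplier design may differ, and the function names are mine.

```python
def exact_fixed_mul(a: int, b: int) -> int:
    """Exact 8-bit unsigned fixed-point multiply (16-bit product)."""
    return a * b

def truncated_mul(a: int, b: int, drop: int = 4) -> int:
    """Approximate multiply: zero the `drop` least significant columns of
    every partial product before accumulating, a standard way to trade a
    bounded error for a smaller, lower-power adder tree."""
    acc = 0
    for i in range(8):                        # one partial product per bit of b
        if (b >> i) & 1:
            pp = a << i
            acc += pp & ~((1 << drop) - 1)    # drop the low columns
    return acc

err = abs(exact_fixed_mul(201, 150) - truncated_mul(201, 150))
print(err)   # → 6 (tiny relative to the 16-bit product 30150)
```

The approximation always underestimates, so the error is one-sided and easy to bound analytically, which is why truncation is a popular baseline in approximate-multiplier work.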
Electronics, 2021
Binarized neural networks (BNNs), which have 1-bit weights and activations, are well suited for FPGA accelerators: their dominant computations are bitwise arithmetic, and the reduced memory requirements mean that all the network parameters can be stored in internal memory. However, the energy efficiency of these accelerators is still restricted by the abundant redundancies in BNNs. This hinders their deployment in smart sensors and tiny devices, scenarios with tight constraints on energy consumption. To overcome this problem, we propose an approach to BNN inference that offers excellent energy efficiency by pruning the massive redundant operations while maintaining the original accuracy of the networks. First, inspired by the observation that the convolution processes of two related kernels contain many repeated computations, we build one formula to clarify the reusing relationships...
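The "bitwise arithmetic" that makes BNNs FPGA-friendly is the standard XNOR-popcount trick: with +1/-1 values packed as 1/0 bits, a dot product collapses to one XNOR and one popcount. A minimal sketch (my own encoding helpers, not the paper's accelerator):

```python
def bnn_dot(w_bits: int, x_bits: int, n: int) -> int:
    """Binary dot product via XNOR + popcount.
    Bits encode +1 as 1 and -1 as 0; each position where the bits agree
    contributes +1 to the dot product, each disagreement contributes -1."""
    xnor = ~(w_bits ^ x_bits) & ((1 << n) - 1)   # 1 wherever bits agree
    matches = bin(xnor).count("1")
    return 2 * matches - n

# Reference check against the +/-1 arithmetic it replaces
w = [1, -1, -1, 1]
x = [1, 1, -1, -1]
to_bits = lambda v: sum((b == 1) << i for i, b in enumerate(v))
assert bnn_dot(to_bits(w), to_bits(x), 4) == sum(a * b for a, b in zip(w, x))
```

On an FPGA the XNOR and popcount map directly to LUTs, which is why the remaining energy cost is dominated by the sheer number of such operations — the redundancy this paper targets.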
Proceedings of the 2021 International Conference on Compilers, Architectures, and Synthesis for Embedded Systems, 2021
We present an architectural approach toward energy-efficient synthesis of circuits used in neural processing units. Neural network applications are shown to tolerate varying operand precisions between different inputs, accuracy targets, their phases, and learning methods, without significantly impacting the classification accuracy. Using multiple instances of systolic arrays at different precisions, we show that significant energy gains are possible beyond the conventional approach, using the same circuit for all precisions.
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2018
Deep Neural Networks (DNNs) have emerged as the state-of-the-art technique in a wide range of machine learning tasks for analytics and computer vision in the next generation of embedded (mobile, IoT, wearable) devices. Despite their success, they suffer from high energy requirements. In recent years, the inherent error resiliency of DNNs has been exploited by introducing approximations at either the algorithmic or the hardware level (individually) to obtain energy savings at the cost of tolerable accuracy degradation. However, the overall energy-accuracy trade-offs arising from introducing approximations at different levels in complex DNNs remain to be investigated. We perform a comprehensive analysis to determine the effectiveness of cross-layer approximations for the energy-efficient realization of large-scale DNNs. The approximations considered are: (i) use of lower-complexity networks (containing fewer layers and/or neurons per layer), (ii) pruning of synaptic weights, (iii) approximate multiplication in the neuronal MAC (Multiply-and-Accumulate) computation, and (iv) approximate write/read operations to/from the synaptic memory. Our experiments on recognition benchmarks (MNIST, CIFAR10) show that cross-layer approximation provides substantial improvements in energy efficiency for different accuracy/quality requirements. Furthermore, we propose a synergistic framework for combining the approximation techniques to achieve maximal energy benefits from approximate DNNs.
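Two of the listed approximations — (ii) weight pruning and (iii) approximate multiplication — compose naturally inside a single MAC. The toy model below illustrates that composition; the threshold and bit-width are illustrative values of my choosing, not the paper's, and low-precision rounding stands in for a true approximate multiplier circuit.

```python
import numpy as np

def approx_mac(weights, inputs, prune_thresh=0.05, frac_bits=4):
    """Toy cross-layer approximate MAC: (ii) prune small synaptic weights,
    then (iii) multiply at reduced fixed-point precision."""
    w = np.where(np.abs(weights) < prune_thresh, 0.0, weights)  # (ii) pruning
    scale = 1 << frac_bits
    wq = np.round(w * scale) / scale        # (iii) quantize to 4 fractional bits
    xq = np.round(inputs * scale) / scale
    return float(np.dot(wq, xq))

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.2, 32)
x = rng.normal(0.0, 1.0, 32)
err = abs(approx_mac(w, x) - float(np.dot(w, x)))
print(err)   # small deviation from the exact dot product
```

The point of a cross-layer study is exactly this interaction: each approximation alone contributes a bounded error, but their combined effect on end-to-end accuracy has to be measured jointly.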
2020 IEEE 63rd International Midwest Symposium on Circuits and Systems (MWSCAS), 2020
Hardware-based machine learning is becoming increasingly popular due to its high speed of computation. One of the desired characteristics of such hardware is reduced hardware and design cost. This paper proposes a design approach for a neural network that reduces hardware cost in terms of adders and multipliers. Adders and multipliers are among the main components of a neural network and are used in every node. The proposed approach halves the number of multipliers and adders in the network, reducing the cost accordingly. The technique is based on sharing a multiplier and adder between two hidden layers. The method has been tested and validated using multiple datasets. The accuracy of the proposed approach is similar to that of traditional methods in the literature, while the proposed approach uses only half the number of multipliers and adders. The design is implemented in VHDL on an Altera Arria 10 GX FPGA. Simulation results show that the proposed method retains the performance of the network with a 63% reduction in the hardware design and acceptable accuracy.
Computer Systems Science and Engineering
Approximate computing is a popular approach to low power consumption, used in applications such as image processing, video processing, multimedia, and data mining. Approximate computing is mostly performed with arithmetic circuits, in particular with multipliers. The multiplier is the most essential element in approximate computing, and power consumption depends heavily on its performance. Several researchers have worked on approximate multipliers for power reduction over the past few decades, but designing a low-power approximate multiplier is not easy: it remains a major challenge for the digital industry to design an approximate multiplier with low power, a minimum error rate, and high accuracy. To address these issues, Deep Learning (DL) approaches are applied to the digital circuits for higher accuracy. In recent times, DL methods have achieved high learning and prediction accuracy in several fields. Therefore, the Long Short-Term Memory (LSTM), a popular time-series DL method, is used in this work for approximate computing. To provide an optimal solution, the LSTM is combined with the meta-heuristic Jellyfish search optimisation technique to design an input-aware deep-learning-based approximate multiplier (DLAM). In this work, the Jellyfish-optimised LSTM model is used to enhance the error-metric performance of the approximate multiplier; the optimal hyperparameters of the LSTM model are identified by Jellyfish search optimisation. This fine-tuning yields an LSTM with higher accuracy. The proposed pre-trained LSTM model is used to generate approximate design libraries for different truncation levels as a function of area, delay, power, and error metrics. Experimental results on an 8-bit multiplier with an image processing application show that the proposed approximate multiplier achieves superior area and power reduction with very good error rates.
Journal of VLSI Signal Processing, 1993
Analog sub-threshold is an attractive microelectronic implementation approach for many applications where power is to be minimized. The process of mapping a neural network to a sub-threshold architecture requires the proper selection of a training algorithm. For analog architectures, conventional training algorithms like backpropagation have many drawbacks although they are computationally efficient on digital computers. In this article we present algorithms that are suitable for analog implementation and we present architectures and implementations of sub-threshold neural networks.
IEEE VLSI Circuits and Systems, 2018
Recent artificial neural network architectures improve performance and power dissipation by leveraging resistive devices to store and multiply synaptic weights with input data. Negative and positive synaptic weights are stored on the memristors of a reconfigurable crossbar array (MCA). Existing MCA-based neural network architectures use high power consuming voltage converters or operational amplifiers to generate the total synaptic current through each column of the crossbar array. This paper presents a low power MCA-based feedforward neural network architecture that uses a spintronic device per pair of columns to generate the synaptic current for each neuron. It is shown experimentally that the proposed architecture dissipates significantly less power compared to existing feedforward memristive neural network architectures.
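The crossbar computation this abstract refers to is analog Ohm's-law-and-Kirchhoff's-law arithmetic: with a column pair per neuron, the signed weight is the difference of two non-negative conductances, and the neuron's net synaptic current is the difference of the two column currents. A minimal numerical model (my own naming; it says nothing about the spintronic readout the paper proposes):

```python
import numpy as np

def crossbar_currents(v, g_pos, g_neg):
    """Memristive crossbar with one column pair per neuron: signed weight
    w[i, j] = g_pos[i, j] - g_neg[i, j], with both conductance matrices
    non-negative. Each + and - column sums its currents (KCL), and the
    neuron's net current is their difference."""
    i_pos = v @ g_pos            # currents summed down the + columns
    i_neg = v @ g_neg            # currents summed down the - columns
    return i_pos - i_neg         # equivalent to v @ (g_pos - g_neg)

rng = np.random.default_rng(1)
g_pos = rng.uniform(0.0, 1.0, (4, 3))   # non-negative conductances
g_neg = rng.uniform(0.0, 1.0, (4, 3))
v = rng.normal(size=4)                  # input voltages
print(crossbar_currents(v, g_pos, g_neg))
```

The architectural question the paper addresses is what circuit performs that final column-pair subtraction and current sensing, since conventional op-amp or converter-based sensing dominates the power budget.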
2013
Neural networks are used in many real-life applications such as prediction and classification. In this paper we present a hardware-efficient FPGA implementation of a neural network, in which we implement a multilayer feed-forward network. We use the CORDIC algorithms of the HP-35 scientific calculator for this purpose. The CORDIC algorithm is an iterative technique for approximating trigonometric functions, and a minor modification enables it to realize many other useful functions. We have modified these algorithms to make them suitable for binary arithmetic operations.
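CORDIC in rotation mode computes sine and cosine using only shifts, adds, and a small table of arctangent constants, which is what makes it attractive for FPGA activation functions. A textbook sketch of the classic algorithm (in floating point for readability; a hardware version would use fixed-point shifts):

```python
import math

def cordic_sin_cos(theta: float, iters: int = 24):
    """CORDIC rotation mode: return (cos(theta), sin(theta)) for
    |theta| < pi/2 by rotating the vector (K, 0) through a sequence of
    micro-angles atan(2^-i), steering each rotation's sign so the
    residual angle z converges to zero."""
    angles = [math.atan(2.0 ** -i) for i in range(iters)]
    K = 1.0
    for i in range(iters):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))   # pre-scale by total CORDIC gain
    x, y, z = K, 0.0, theta
    for i, a in enumerate(angles):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return x, y

c, s = cordic_sin_cos(0.5)
print(c, s)   # close to cos(0.5) = 0.87758..., sin(0.5) = 0.47942...
```

Each iteration adds roughly one bit of accuracy, so the iteration count directly trades latency (or pipeline depth) against precision on the FPGA.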
2010
This paper presents a digital, transistor level implemented neo-fuzzy neural network. This type of neural network is particularly well suited for real-time applications like those encountered in signal processing and nonlinear system identification. We consider in detail a flexible reconfigurable circuit of a single nonlinear synapse of this network. When combining such circuits, single-layer or multilayer networks can be designed. The advantages of the proposed circuit come in the form of reduced redundancy, high data rate due to parallel operation, low power consumption, and an overall flexibility of system configuration.
This paper explores implementation approaches for a low-power Modified Booth Multiplier (MBM) with a Reduced Spurious Transition Activity Technique (RSTAT) and its application to a low-power (LP) neural network. The RSTAT approach is applied to both the compression tree of the multipliers and the modified Booth encoder to increase the power savings, for high-speed and low-power operation. To filter out the spurious switching power of the multiplier, two approaches are proposed, one using registers and one using AND gates, to assert the data signals of the LP multipliers after the data transition. The RSTAT approach yields a 40% reduction in power consumption and a speed improvement compared with other power-minimization techniques. An artificial neural network is a system consisting of small processing units (called neurons) that perform specific tasks in parallel. The hardware implementation of such a neural network consists mainly of a multiplier circuit for the product terms along with an adder circuit for the summation, so the above low-power multiplier can be used in the neural network for low-power VLSI implementations.
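The "modified Booth" part of the MBM is radix-4 Booth recoding: the multiplier operand is recoded into digits in {-2, -1, 0, 1, 2} by inspecting overlapping bit triples, halving the number of partial products the compression tree must sum. A behavioral sketch of the recoding (this models only the encoding, not the RSTAT circuitry):

```python
def booth_radix4_digits(b: int, n: int = 8):
    """Radix-4 modified Booth recoding of an n-bit two's-complement
    multiplier. Digit i inspects bits (b[2i+1], b[2i], b[2i-1]) and is
    -2*b[2i+1] + b[2i] + b[2i-1], with b[-1] = 0, giving n/2 partial
    products instead of n."""
    u = b & ((1 << n) - 1)                    # two's-complement bit pattern
    digits = []
    for i in range(n // 2):
        b_hi = (u >> (2 * i + 1)) & 1
        b_mid = (u >> (2 * i)) & 1
        b_lo = (u >> (2 * i - 1)) & 1 if i > 0 else 0
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits

def booth_mul(a: int, b: int, n: int = 8) -> int:
    """Multiply by summing the shifted recoded partial products."""
    return sum(d * a * 4 ** i for i, d in enumerate(booth_radix4_digits(b, n)))

print(booth_radix4_digits(-23))   # → [1, -2, -1, 0]
assert booth_mul(57, -23) == 57 * -23
```

In hardware each digit selects 0, ±a, or ±2a (a shift and optional negation), so halving the digit count halves the compression-tree inputs — which is exactly where the paper then applies RSTAT to suppress spurious transitions.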
Sensors, 2022
arXiv (Cornell University), 2017
International Wireless Communications and Mobile Computing Conference (IWCMC), 2011
2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), 2019
2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2019
2017 25th Signal Processing and Communications Applications Conference (SIU)
Applied Intelligence, 2000
2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)