Papers by Josep L. Rossello

arXiv (Cornell University), Jun 22, 2020
A new trans-disciplinary knowledge area, Edge Artificial Intelligence or Edge Intelligence, is beginning to receive a tremendous amount of interest from the machine learning community due to the ever-increasing popularization of the Internet of Things (IoT). Unfortunately, the incorporation of AI characteristics to edge computing devices presents the drawbacks of being power and area hungry for typical machine learning techniques such as Convolutional Neural Networks (CNN). In this work, we propose a new power-and-area-efficient architecture for implementing Artificial Neural Networks (ANNs) in hardware, based on the exploitation of the correlation phenomenon in Stochastic Computing (SC) systems. The proposed architecture can solve the difficult implementation challenges that SC presents for CNN applications, such as the high resources used in binary-to-stochastic conversion, the inaccuracy produced by undesired correlation between signals, and the stochastic maximum function implementation. Compared with traditional binary logic implementations, experimental results showed an improvement of 19.6x and 6.3x in terms of speed performance and energy efficiency, respectively, for the FPGA implementation. We have also realized a full VLSI implementation of the proposed SC-CNN architecture, demonstrating that our optimizations achieve an 18x area reduction over a previous SC-DNN architecture VLSI implementation in a comparable technological node. For the first time, a fully-parallel CNN such as LENET-5 is embedded and tested in a single FPGA, showing the benefits of using stochastic computing for embedded applications, in contrast to traditional binary logic implementations.
Computational Intelligence and Neuroscience, 2016
Hardware implementation of artificial neural networks (ANNs) allows exploiting the inherent parallelism of these systems. Nevertheless, they require a large amount of resources in terms of area and power dissipation. Recently, Reservoir Computing (RC) has arisen as a strategic technique to design recurrent neural networks (RNNs) with simple learning capabilities. In this work, we show a new approach to implement RC systems with digital gates. The proposed method is based on the use of probabilistic computing concepts to reduce the hardware required to implement different arithmetic operations. The result is the development of a highly functional system with low hardware resources. The presented methodology is applied to chaotic time-series forecasting.
2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), 2015
2015 International Joint Conference on Neural Networks (IJCNN), 2015
International Journal of Neural Systems, 2015
Spiking neural networks (SNN) are the latest generation of neural networks, which try to mimic the real behavior of biological neurons. Although most research in this area is done through software applications, it is in hardware implementations that the intrinsic parallelism of these computing systems is more efficiently exploited. Liquid state machines (LSM) have arisen as a strategic technique to implement recurrent designs of SNN with a simple learning methodology. In this work, we show a new low-cost methodology to implement high-density LSM by using Boolean gates. The proposed method is based on the use of probabilistic computing concepts to reduce hardware requirements, thus considerably increasing the neuron count per chip. The result is a highly functional system that is applied to high-speed time series forecasting.
PLOS ONE, 2015
Minimal hardware implementations able to cope with the processing of large amounts of data in reasonable times are highly desired in our information-driven society. In this work we review the application of stochastic computing to probabilistic-based pattern-recognition analysis of huge database sets. The proposed technique consists in the hardware implementation of a parallel architecture implementing a similarity search of data with respect to different pre-stored categories. We design pulse-based stochastic-logic blocks to obtain an efficient pattern recognition system. The proposed architecture speeds up the screening process of huge databases by a factor of 7 when compared to a conventional digital implementation using the same hardware area.

IEEE Transactions on Neural Networks and Learning Systems, 2015
This paper presents a new methodology for the hardware implementation of neural networks (NNs) based on probabilistic laws. The proposed encoding scheme circumvents the limitations of classical stochastic computing (based on unipolar or bipolar encoding) extending the representation range to any real number using the ratio of two bipolar-encoded pulsed signals. Furthermore, the novel approach presents practically a total noise-immunity capability due to its specific codification. We introduce different designs for building the fundamental blocks needed to implement NNs. The validity of the present approach is demonstrated through a regression and a pattern recognition task. The low cost of the methodology in terms of hardware, along with its capacity to implement complex mathematical functions (such as the hyperbolic tangent), allows its use for building highly reliable systems and parallel computing.
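The encoding idea in this abstract can be illustrated with a small software sketch. It is not the paper's hardware design: it only assumes the standard bipolar convention, in which a value x in [-1, 1] is carried by a bit stream with P(bit = 1) = (x + 1)/2, and then decodes a number outside that range as the ratio of two such streams. The function names, stream length, and seed are hypothetical choices made for the illustration.

```python
import random


def bipolar_stream(x, n_bits, rng):
    """Encode x in [-1, 1] as a bit stream with P(bit = 1) = (x + 1) / 2."""
    p_one = (x + 1.0) / 2.0
    return [1 if rng.random() < p_one else 0 for _ in range(n_bits)]


def bipolar_value(stream):
    """Decode a bipolar stream: value = 2 * (fraction of ones) - 1."""
    return 2.0 * sum(stream) / len(stream) - 1.0


def ratio_value(num_stream, den_stream):
    """Decode a real number as the ratio of two bipolar-encoded streams."""
    return bipolar_value(num_stream) / bipolar_value(den_stream)


# Assumed illustration values: 0.6 / 0.2 = 3.0, which lies outside [-1, 1].
rng = random.Random(0)
n_bits = 200_000
num = bipolar_stream(0.6, n_bits, rng)
den = bipolar_stream(0.2, n_bits, rng)
estimate = ratio_value(num, den)
print(estimate)  # a noisy estimate close to 3.0
```

The estimate is statistical, so its precision improves with longer streams and degrades as the denominator value approaches zero.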
IEICE Proceeding Series, 2014
Lecture Notes in Computer Science, 2006
Abstract. We present a novel technique to accurately describe the leakage power in CMOS nanometer Integrated Circuits (ICs) considering process variations. The model predicts a leakage power increment due to process variations with high accuracy. It is shown that leakage increases considerably as channel length variations become larger due to technology scaling. The present work also describes accurately the dependence of leakage dispersion with process variations. The model developed shows that, even if channel ...
Lecture Notes in Computer Science, 2005
We present a compact model to estimate quickly and accurately the leakage power in CMOS nanometer Integrated Circuits (ICs). The model has accuracy similar to that of SPICE and represents an important improvement with respect to previous works. It has been developed to be used for fast and accurate estimation and optimization of the standby power dissipated by large circuits.

International Journal of Neural Systems, 2014
The brain is characterized by performing many diverse processing tasks ranging from elaborate processes such as pattern recognition, memory or decision making to more simple functionalities such as linear filtering in image processing. Understanding the mechanisms by which the brain is able to produce such a different range of cortical operations remains a fundamental problem in neuroscience. Here we show a study about which processes are related to chaotic and synchronized states based on the study of in-silico implementation of Stochastic Spiking Neural Networks (SSNN). The measurements obtained reveal that chaotic neural ensembles are excellent transmission and convolution systems since mutual information between signals is minimized. At the same time, synchronized cells (that can be understood as ordered states of the brain) can be associated with more complex nonlinear computations. In this sense, we experimentally show that complex and quick pattern recognition processes arise when both synchronized and chaotic states are mixed. These measurements are in accordance with in vivo observations related to the role of neural synchrony in pattern recognition and to the speed of the real biological process. We also suggest that the high-level adaptive mechanisms of the brain, namely the Hebbian and non-Hebbian learning rules, can be understood as processes devoted to generating the appropriate clustering of both synchronized and chaotic ensembles. The measurements obtained from the hardware implementation of different types of neural systems suggest that the brain processing can be governed by the superposition of these two complementary states with complementary functionalities (nonlinear processing for synchronized states and information convolution and parallelization for chaotic).
Lecture Notes in Computer Science, 2003
In this work we propose a compact analytical model to compute the crosstalk-induced delay from a charge-based propagation delay model for submicronic CMOS gates. Crosstalk delay is described as an additional charge to be transferred through the pMOS (nMOS) network of the gate driving the victim node during its rising (falling) output transition. The model accounts for time skew between the victim and aggressor input transitions and includes submicronic effects. It provides an intuitive description of crosstalk delay showing very ...
Lecture Notes in Computer Science, 2002
We provide an accurate analytical expression for the propagation delay and the output transition time of submicron CMOS buffers that takes into account the short-circuit current, the input-output coupling capacitance, and the carrier velocity saturation effects, of increasing importance in deep-submicron technologies. The model is based on the nth-power law MOSFET model and computes the propagation delay from the charge delivered to the gate. Comparison with HSPICE level 50 simulations and other previously published ...
Nature Precedings, 2012
The brain is characterized by performing many different processing tasks ranging from elaborate processes such as pattern recognition, memory or decision-making to more simple functionalities such as linear filtering in image processing. Understanding the mechanisms by which the brain is able to produce such a different range of cortical operations remains a fundamental problem in neuroscience. Some recent empirical and theoretical results support the notion that the brain is naturally poised near criticality between ordered and chaotic states. As the largest number of metastable states exists at a point near the transition, the brain can therefore access a larger repertoire of behaviours.
Lecture Notes in Computer Science, 2009
Abstract. This work provides practical guidelines for an efficient hardware implementation of Neural Networks. Networks are configured using a practical self-learning architecture that iterates a basic Genetic Algorithm. The learning methodology is based on the generation of random vectors that can be extracted from chaotic signals. The proposed solution is applied to estimate the processing efficiency of Spiking Neural Networks. Keywords: Neural Networks, Spiking Neural Networks, Hardware implementation of Genetic Algorithms.
Lecture Notes in Computer Science, 2009
Abstract. In this work we provide design guidelines for the hardware implementation of Spiking Neural Networks. The proposed methodology is applied to temporal pattern recognition analysis. For this purpose the networks are trained using a simplified Genetic Algorithm. The proposed solution is applied to estimate the processing efficiency of Spiking Neural Networks. Keywords: Neural Networks, Spiking Neural Networks, Hardware implementation of Genetic Algorithms.
Test Conference, …, 2004
Index Terms: Clock skew, clock distribution networks, temperature gradient, interconnect delay.
International Journal of …, Jan 1, 2009
A new design of Spiking Neural Networks is proposed and fabricated using a 0.35 µm CMOS technology. The architecture is based on the use of both digital and analog circuitry. The digital circuitry is dedicated to the inter-neuron communication while the analog part implements the internal non-linear behavior associated with spiking neurons. The main advantages of the proposed system are the small area of integration with respect to digital solutions, its implementation using a standard CMOS process only and the reliability of the inter-neuron communication.
Pattern Recognition Letters, Jan 1, 2010
In this work we review the basic principles of stochastic logic and propose its application to probabilistic-based pattern-recognition analysis. The proposed technique is intrinsically a parallel comparison of input data to various pre-stored categories using Bayesian techniques. We design smart pulse-based stochastic-logic blocks to provide an efficient pattern-recognition analysis. The proposed architecture is applied to a specific navigation problem.
Neural Networks (IJCNN), …, Jan 1, 2010
This paper provides practical guidelines for an efficient hardware implementation of Neural Networks. Networks are configured using a practical self-learning architecture that iterates a basic Genetic Algorithm. The learning methodology is based on the generation of random vectors that can be extracted from chaotic signals. The proposed solution is applied to estimate the processing efficiency of Spiking Neural Networks.