ABDULLAH HANIF

Kyrgyz State Medical Academy, General medicine, Undergraduate

Papers by ABDULLAH HANIF

HW/SW co-design and co-optimizations for deep learning

Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications

Deep Learning algorithms have been proven to provide state-of-the-art results in many applications, but at the cost of high computational complexity. Therefore, accelerating such algorithms in hardware is highly desirable. However, since the computational requirements grow exponentially with accuracy, their demand for hardware resources is significant. To tackle this issue, we propose a methodology, involving both software and hardware, to optimize Deep Neural Networks (DNNs). We discuss and analyze pruning, approximations through quantization, and specialized accelerators for DNN inference. For each phase of the methodology, we provide quantitative comparisons with the existing techniques and hardware platforms.

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks

2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS)

Adversarial examples have emerged as a significant threat to machine learning algorithms, especially convolutional neural networks (CNNs). In this paper, we propose two quantization-based defense mechanisms, Constant Quantization (CQ) and Trainable Quantization (TQ), to increase the robustness of CNNs against adversarial examples. CQ quantizes input pixel intensities based on a "fixed" number of quantization levels, while in TQ the quantization levels are "iteratively learned during the training phase", thereby providing a stronger defense mechanism. We apply the proposed techniques to undefended CNNs against different state-of-the-art adversarial attacks from the open-source Cleverhans library. The experimental results demonstrate 50%-96% and 10%-50% increases in the classification accuracy of the perturbed images generated from the MNIST and the CIFAR-10 datasets, respectively, on a commonly used CNN (Conv2D(64, 8x8)-Conv2D(128, 6x6)-Conv2D(128, 5x5)-Dense(10)-Softmax()) available in the Cleverhans library.
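To make the idea of Constant Quantization concrete, the following minimal Python sketch snaps each input pixel to one of a fixed number of levels. The abstract does not say how the levels are placed, so the uniform spacing over [0, 1], the function name `constant_quantize`, the `levels` parameter, and the sample pixel values are all assumptions made for illustration. The trainable variant (TQ) is not shown.

```python
import math

def constant_quantize(pixels, levels=4):
    """Snap each pixel in [0, 1] to the nearest of `levels` uniformly spaced values.

    Uniform spacing is an assumption for illustration; the abstract only
    states that the number of quantization levels is fixed.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    steps = levels - 1
    quantized = []
    for p in pixels:
        p = min(1.0, max(0.0, p))            # clamp to the valid intensity range
        index = math.floor(p * steps + 0.5)  # nearest level index (round half up)
        quantized.append(index / steps)
    return quantized

# Hypothetical pixel values: a clean input and a slightly perturbed copy.
clean = [0.00, 0.30, 0.70, 1.00]
perturbed = [0.04, 0.27, 0.73, 0.96]

# Small perturbations that stay within one quantization bin are removed.
print(constant_quantize(clean) == constant_quantize(perturbed))
```

In this toy case both inputs quantize to the same values, which is the intuition behind using quantization as a pre-processing defense. Perturbations large enough to cross a bin boundary would still pass through.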

Configurable Models and Design Space Exploration for Low-Latency Approximate Adders

Approximate Circuits

Addition is one of the most commonly used operations in almost all data processing-related applications. High-performance adders have become common in applications that require low latency and/or high throughput. One common type of such adders, which has proven to be highly effective for improving system latency, is the fast/parallel-prefix adder. While these adders can provide effective performance benefits, they introduce significant power and area overhead due to the requirement of parallel carry generation logic. Coincidentally, most applications that involve intensive data processing are somewhat resilient to errors and can therefore leverage the concepts of approximate computing to achieve significant performance improvements [1-4]. Several high-performance approximate adders have been proposed, for example, ETA-II [5], ETA-IIM [5], ACA [6, 7], GDA [8], etc., that improve the performance of adder blocks beyond that of the conventional accurate designs. Each approximate low-latency adder has its own unique error, performance, area, and power characteristics and is, therefore, suitable for different scenarios. Almost all such adders can be categorized under the umbrella of block-based adders, as they employ smaller sub-adder units/blocks which operate in parallel to compute the resultant bits of the output. A few example approximate low-latency adders are shown in Fig. 1.1.
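The block-based structure described above can be sketched as a small behavioral model in Python. This is a simplified illustration and not a faithful model of ETA-II, ACA, or GDA: the function name `block_based_add` and the parameters `width`, `block`, and `predict` are assumptions introduced here. Each sub-adder produces `block` result bits and sees only `predict` lower-order operand bits for carry prediction, so a carry that originates further down is lost.

```python
def block_based_add(a, b, width=8, block=4, predict=0):
    """Behavioral model of a generic block-based approximate adder.

    Each sub-adder computes `block` result bits independently, using only
    `predict` lower-order operand bits to estimate its carry-in.
    All parameter names are illustrative assumptions.
    """
    if block <= 0 or predict < 0:
        raise ValueError("block must be positive and predict non-negative")
    result = 0
    for start in range(0, width, block):
        lo = max(0, start - predict)       # lowest operand bit this sub-adder sees
        hi = min(width, start + block)     # one past the highest bit it sees
        mask = (1 << (hi - lo)) - 1
        partial = ((a >> lo) & mask) + ((b >> lo) & mask)
        bits = partial >> (start - lo)     # drop the carry-prediction bits
        if hi < width:
            bits &= (1 << (hi - start)) - 1  # inner blocks discard their carry-out
        result |= bits << start
    return result

# No carry crosses the block boundary, so the result is exact.
print(hex(block_based_add(0x12, 0x34)))
# A carry from the low block is dropped when there is no carry prediction.
print(hex(block_based_add(0x0F, 0x01)), "vs exact", hex(0x0F + 0x01))
# With enough prediction bits, the same addition becomes exact again.
print(hex(block_based_add(0x0F, 0x01, predict=4)))
```

Because the sub-adders do not wait for one another, the critical path is bounded by the sub-adder length (`block + predict` bits in this model) rather than the full operand width, which is where the latency benefit comes from. The price is an occasional missed carry.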

A cross-layer approach towards developing efficient embedded Deep Learning systems

Microprocessors and Microsystems

Exploiting Vulnerabilities in Deep Neural Networks: Adversarial and Fault-Injection Attacks

ArXiv, 2021

From tiny pacemaker chips to aircraft collision avoidance systems, state-of-the-art Cyber-Physical Systems (CPS) have increasingly started to rely on Deep Neural Networks (DNNs). However, as concluded in various studies, DNNs are highly susceptible to security threats, including adversarial attacks. In this paper, we first discuss different vulnerabilities that can be exploited for generating security attacks on neural network-based systems. We then provide an overview of existing adversarial and fault-injection-based attacks on DNNs. We also present a brief analysis to highlight different challenges in the practical implementation of adversarial attacks. Finally, we discuss various prospective ways to develop robust DNN-based systems that are resilient to adversarial and fault-injection attacks.

Sejarah kebudayaan Islam MTs kelas 3

FANNet: Formal Analysis of Noise Tolerance, Training Bias and Input Sensitivity in Neural Networks

2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020

With constant improvements in network architectures and training methodologies, Neural Networks (NNs) are increasingly being deployed in real-world Machine Learning systems. However, despite their impressive performance on "known inputs", these NNs can fail absurdly on "unseen inputs", especially if these real-time inputs deviate from the training dataset distributions or contain certain types of input noise. This indicates the low noise tolerance of NNs, which is a major reason for the recent increase in adversarial attacks. This is a serious concern, particularly for safety-critical applications, where inaccurate results lead to dire consequences. We propose a novel methodology that leverages model checking for the Formal Analysis of Neural Network (FANNet) under different input noise ranges. Our methodology allows us to rigorously analyze the noise tolerance of NNs, their input node sensitivity, and the effects of training bias on their performance, ...

TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation

IEEE Access

Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, like limited memory capacity and energy resources, due to the many computations in convolution layers. In order to reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed symmetry approximation, and the structure of the Winograd algorithm, termed tile quantization approximation. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds to control each approximation's intensity. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved ∼5.28x multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is ∼1.08x less multiplicative workload as compared to state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reduction was ∼3.87x and ∼3.93x for the MNIST and Fashion-MNIST datasets, and the additive workload reduction was ∼2.5x and ∼2.56x for the respective datasets. There is no significant accuracy loss for the MNIST and Fashion-MNIST datasets.

Approximation of 5-by-5 filter: For finding symmetric and anti-symmetric coefficients, two possible configurations are considered. Fig. 12 shows the kernel masks for adjacent and opposite sides, where the red cells correspond to -1 and the blue ones correspond to 1. White cells express zero values. For maximum symmetry/anti-symmetry, it is necessary to check the possibilities of adjacent and opposite side symmetry and anti-symmetry at the rotated intervals: 0°, 90°, 180°, and 270°. Equations 9 and 10 show the symmetric W and anti-symmetric W’ coefficients, whereas Equations 7 and 8 represent point-wise multiplication of the two masks shown in Fig. 12. Further, ® and ®’ denote a multiplicative and accumulative result for symmetry and anti-symmetry, respectively.
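The general reason filter symmetry reduces multiplicative workload can be shown with a short Python sketch. This is not the paper's symmetry approximation or its mask-based formulation. It assumes, for illustration only, a 3-by-3 kernel that is exactly mirror-symmetric about its middle column, and the function names and sample values are hypothetical. Inputs that share a coefficient are added first, so each shared coefficient is multiplied only once.

```python
def dot_direct(patch, kernel):
    """Standard 3x3 multiply-accumulate: 9 multiplications."""
    return sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))

def dot_symmetric(patch, kernel):
    """Multiply-accumulate for a kernel with kernel[r][0] == kernel[r][2].

    Mirrored inputs are summed before multiplying, so only 6
    multiplications are needed instead of 9 (at the cost of 3 extra additions
    being traded for 3 fewer multiplications).
    """
    total = 0
    for r in range(3):
        total += kernel[r][0] * (patch[r][0] + patch[r][2])  # shared coefficient
        total += kernel[r][1] * patch[r][1]                  # middle column
    return total

# Hypothetical symmetric kernel and input patch.
kernel = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

print(dot_direct(patch, kernel), dot_symmetric(patch, kernel))
```

When a trained filter is only approximately symmetric, forcing it into a symmetric form introduces error, which is why a threshold on how aggressively to apply the approximation matters.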

Pembuatan Aplikasi Multimedia Informasi pada Kios Informasi Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Sebelas Maret Surakarta

ABSTRACT. Abdullah Hanif. 2018. “Pembuatan Aplikasi Multimedia Informasi Pada Kios Informasi Fakultas Matematika Dan Ilmu Pengetahuan Alam Universitas Sebelas Maret” (Development of a Multimedia Information Application for the Information Kiosk of the Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret). Diploma III Program in Informatics Engineering, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret Surakarta. The Faculty of Mathematics and Natural Sciences has endeavored to provide information boards that the academic community can use to share information. However, the management and delivery of information on the existing boards are generally still less than optimal, so the academic community's interest in obtaining information is not yet well targeted. The application is developed using Luther's multimedia development method, which consists of concept, design, material collection, assembly, testing, and distribution. The application has 4 interface designs: ...

A survey of hardware architectures for generative adversarial networks

J. Syst. Archit., 2021

Recent years have witnessed significant interest in "generative adversarial networks" (GANs) due to their ability to generate high-fidelity data. Many GAN models have been proposed for a diverse range of domains, from natural language processing to image processing. GANs have high compute and memory requirements. Also, since they involve both convolution and deconvolution operations, they do not map well to conventional accelerators designed for convolution operations. Evidently, there is a need for customized accelerators to achieve high efficiency with GANs. In this work, we present a survey of techniques and architectures for accelerating GANs. We organize the works on key parameters to bring out their differences and similarities. Finally, we present research challenges that are worthy of attention in the near future. More than summarizing the state of the art, this survey seeks to spark further research in the field of GAN accelerators.

DNN-Life: An Energy-Efficient Aging Mitigation Framework for Improving the Lifetime of On-Chip Weight Memories in Deep Neural Network Hardware Architectures

2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Negative Bias Temperature Instability (NBTI)-induced aging is one of the critical reliability threats in nano-scale devices. This paper makes the first attempt to study NBTI aging in the on-chip weight memories of deep neural network (DNN) hardware accelerators subjected to complex DNN workloads. We propose DNN-Life, a specialized aging analysis and mitigation framework for DNNs, which jointly exploits hardware- and software-level knowledge to improve the lifetime of a DNN weight memory with reduced energy overhead. At the software level, we analyze the effects of different DNN quantization methods on the distribution of the bits of weight values. Based on the insights gained from this analysis, we propose a micro-architecture that employs low-cost memory-write (and read) transducers to achieve an optimal duty-cycle at run time in the weight memory cells, thereby balancing their aging. As a result, our DNN-Life framework enables efficient aging mitigation of the weight memory of the given DNN hardware at minimal energy overhead during the inference process.

Hardware–Software Approximations for Deep Neural Networks

Approximate Circuits

Neural networks (NNs) are the state of the art for many artificial intelligence (AI) applications. However, in order to facilitate the training process, most neural networks are over-parameterized, which results in significant computational and memory overheads. Therefore, to alleviate the computational and memory requirements of these NNs, numerous optimization techniques have been proposed. In this chapter, we highlight one of the prominent paradigms, i.e., approximate computing, that can significantly reduce the resource requirements of these networks. We describe a sensitivity analysis methodology for estimating the significance of sub-parts of state-of-the-art NNs. Based upon the significance analysis, we then present a methodology for employing a tolerable amount of approximation at various stages of the network, i.e., removal of ineffectual filters/neurons at the software layer, and precision reduction and memory approximations at the hardware layer. Towards the end of this chapter, we also highlight a few of the prominent challenges in adopting different types of approximation and the effects that they have on the overall efficiency and accuracy of the baseline networks.

Dependable Deep Learning: Towards Cost-Efficient Resilience of Deep Neural Network Accelerators against Soft Errors and Permanent Faults

2020 IEEE 26th International Symposium on On-Line Testing and Robust System Design (IOLTS), 2020

Deep Learning has enabled machines to learn computational models (i.e., Deep Neural Networks, DNNs) that can perform certain complex tasks with claims to be close to human-level precision. This state-of-the-art performance offered by DNNs in many Artificial Intelligence (AI) applications has paved their way to being used in several safety-critical applications where even a single failure can lead to catastrophic results. Therefore, improving the robustness of these models to hardware-induced faults (such as soft errors, aging, and manufacturing defects) is of significant importance to avoid any disastrous event. Traditional redundancy-based fault mitigation techniques cannot be employed in a wide range of applications due to their high overheads, which, when coupled with the compute-intensive nature of DNNs, lead to undesirable resource consumption. In this article, we present an overview of different low-cost fault-mitigation techniques that exploit the intrinsic characteristics of DNNs...

QuAd: Design and analysis of Quality-area optimal Low-Latency approximate Adders

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), 2017

Approximate circuits exploit the error resilience of applications to trade off computation quality (accuracy) for gains in performance, power, and/or area. While state-of-the-art low-latency approximate adders provide an accuracy-area-latency configurable design space, the selection of a particular configuration from the design space is still done manually. In this paper, we analyze different structural properties of low-latency approximate adders to formulate a new adder model, Quality-area optimal Low-Latency approximate Adder (QuAd). It provides a larger design space than the state of the art, with design points that require less logic area for the same accuracy than state-of-the-art approximate adders. Furthermore, based upon our mathematical analysis, we show that, provided a latency constraint, an adder configuration with the highest quality and lowest area requirement can effortlessly be selected from the whole d...

ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks considering Unreliable Memories

Spiking neural networks (SNNs) have shown potential for low energy consumption with unsupervised learning capabilities due to their biologically-inspired computation. However, they may suffer from accuracy degradation if their processing is performed in the presence of hardware-induced faults in memories, which can come from manufacturing defects or voltage-induced approximation errors. Since recent works still focus on fault modeling and random fault injection in SNNs, the impact of memory faults in SNN hardware architectures on accuracy and the respective fault-mitigation techniques are not thoroughly explored. Toward this, we propose ReSpawn, a novel framework for mitigating the negative impacts of faults in both the off-chip and on-chip memories for resilient and energy-efficient SNNs. The key mechanisms of ReSpawn are: (1) analyzing the fault tolerance of SNNs; and (2) improving the SNN fault tolerance through (a) fault-aware mapping (FAM) in memories, and (b) fault-aware t...

Masa depan pesantren : dalam tantangan modernitas dan tantangan kompleksitas global

Paper Title (use style: paper title)

The exponential increase in dependencies between the cyber and physical world leads to an enormous amount of data which must be efficiently processed and stored. Therefore, computing paradigms are evolving towards machine learning (ML)-based systems because of their ability to process this data efficiently and accurately. Although ML-based solutions address the efficient computing requirements of big data, they introduce (new) security vulnerabilities into the systems, which cannot be addressed by traditional monitoring-based security measures. Therefore, this paper first presents a brief overview of various security threats in machine learning, their respective threat models, and associated research challenges to develop robust security measures. To illustrate the security vulnerabilities of ML during training, inferencing and hardware implementation, we demonstrate some key security threats on ML using LeNet and VGGNet for MNIST and German Traffic Sign Recognition B...

Perancangan Sistem Pengenalan Suara Sebagai Pengendali Laptop Berbasis Arduino Uno

Advancing technology brings many benefits to everyday life. One innovation resulting from this progress is voice commands, which let users control their electronic devices, switching them on or off using only their voice. The spoken digital audio is processed and controlled by the system to recognize the detected voice command. This voice recognition system is designed to make it easier for users to operate a laptop by voice. The Arduino Uno-based design uses the EasyVR module for voice recognition, together with a wireless microphone so that commands can be given at a distance from the laptop. The results of this study are expected to serve as a prototype voice recognition system for turning a laptop on or off with voice...

Weight Quantization Retraining for Sparse and Compressed Spatial Domain Correlation Filters

Using Spatial Domain Correlation Pattern Recognition (CPR) in Internet-of-Things (IoT)-based applications often faces constraints, like inadequate computational resources and limited memory. To reduce the computation workload of inference due to large spatial-domain CPR filters and to convert filter weights into hardware-friendly data types, this paper introduces the power-of-two (Po2) and dynamic-fixed-point (DFP) quantization techniques for weight compression and sparsity induction in filters. Weight quantization re-training (WQR), the log-polar, and the inverse log-polar geometric transformations are introduced to reduce quantization error. WQR is a method of retraining the CPR filter, which is presented to recover the accuracy loss. It enforces the given quantization scheme by adding the quantization error to the training sample and then re-quantizes the filter to the desired quantization levels, which reduces quantization noise. Further, Particle Swarm Optimization (PSO) is used t...
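As a rough illustration of power-of-two quantization, the Python sketch below maps each weight to a signed power of two, which lets hardware replace a multiplication with a bit shift. The abstract does not give the exact scheme, so rounding in the log2 domain, the exponent range `min_exp` to `max_exp`, and zeroing weights below half the smallest level (one simple way to induce sparsity) are all assumptions made here. The retraining step (WQR) is not shown.

```python
import math

def quantize_po2(weight, min_exp=-4, max_exp=0):
    """Map a weight to sign * 2**e with e in [min_exp, max_exp], or to zero.

    Assumptions for illustration: the exponent is chosen by rounding
    log2(|weight|), and very small weights are pruned to zero.
    """
    smallest = 2.0 ** min_exp
    if abs(weight) < smallest / 2:
        return 0.0                          # pruned: contributes to sparsity
    exp = round(math.log2(abs(weight)))     # nearest exponent in the log domain
    exp = min(max_exp, max(min_exp, exp))   # clamp to the representable range
    return math.copysign(2.0 ** exp, weight)

# Hypothetical filter weights.
weights = [0.3, -0.7, 0.01, 3.0, 0.04]
print([quantize_po2(w) for w in weights])
```

Each surviving weight can then be stored as a sign bit plus a small exponent, and the quantization error (original minus quantized value) is what a retraining step would try to compensate for.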

Approximate computing across the hardware and software stacks

Many-Core Computing: Hardware and Software

Emerging fields like big data and IoT have brought a number of challenges for the hardware as well as the software design community. Some of the major challenges are to scale the computational and memory resources and the efficiency of the processing devices as per the growing needs. In the past few years, a number of fields have emerged for addressing these challenges. We focus on one of the prominent paradigms that has the potential to improve resource efficiency regardless of the underlying technology, i.e., approximate computing (AC). AC aims at relaxing the bounds of exact computing to provide new opportunities for achieving gains in terms of energy, power, performance, and/or area efficiency at the cost of reduced output quality, typically within the tolerable range. We first provide an overview of AC and the techniques which are commonly employed at different abstraction levels for alleviating the resource requirements of computationally intensive applications. Afterwards, a detailed discussion on component-level approximations and their probabilistic behavior, considering approximate adders and multipliers, is presented. Next, a methodology used to construct efficient accelerators from these components is discussed. The discussion is then extended to approximate memories and runtime management systems. Toward the end of the chapter, we present a methodology for designing energy-efficient many-core systems based upon approximate components, followed by the challenges in adopting a cross-layer approach for designing highly energy-, power-, and performance-efficient systems.

HW/SW co-design and co-optimizations for deep learning

Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications

Deep Learning algorithms have been proven to provide state-of-the-art results in many application... more Deep Learning algorithms have been proven to provide state-of-the-art results in many applications but at the cost of a high computational complexity. Therefore, accelerating such algorithms in hardware is highly needed. However, since the computational requirements are growing exponentially along with the accuracy, their demand for hardware resources is significant. To tackle this issue, we propose a methodology, involving both software and hardware, to optimize the Deep Neural Networks (DNNs). We discuss and analyze pruning, approximations through quantization and specialized accelerators for DNN inference. For each phase of the methodology, we provide quantitative comparisons with the existing techniques and hardware platforms.

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks

2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS)

Adversarial examples have emerged as a significant threat to machine learning algorithms, especia... more Adversarial examples have emerged as a significant threat to machine learning algorithms, especially to the convolutional neural networks (CNNs). In this paper, we propose two quantization-based defense mechanisms, Constant Quantization (CQ) and Trainable Quantization (TQ), to increase the robustness of CNNs against adversarial examples. CQ quantizes input pixel intensities based on a "fixed" number of quantization levels, while in TQ, the quantization levels are "iteratively learned during the training phase", thereby providing a stronger defense mechanism. We apply the proposed techniques on undefended CNNs against different state-of-the-art adversarial attacks from the open-source Cleverhans library. The experimental results demonstrate 50%-96% and 10%-50% increase in the classification accuracy of the perturbed images generated from the MNIST and the CIFAR-10 datasets, respectively, on commonly used CNN (Conv2D(64, 8x8)-Conv2D(128, 6x6)-Conv2D(128, 5x5)-Dense(10)-Softmax()) available in Cleverhans library.

Configurable Models and Design Space Exploration for Low-Latency Approximate Adders

Approximate Circuits

Addition is one of the most commonly used operations in almost all the data processing-related ap... more Addition is one of the most commonly used operations in almost all the data processing-related applications. High-performance adders have become significantly common for applications that require low latency and/or high throughput. One of the common types of such adders, which has proven to be highly effective for improving the latency of the systems, is fast/parallel-prefix adders. While these adders can provide effective performance benefits, they do introduce significant power and area overhead due to the requirement of parallel carry generation logic. Coincidentally, most of the applications which involve intensive data processing are somewhat resilient to errors and therefore can leverage the concepts of approximate computing to achieve significant performance improvements [1-4]. Several high-performance approximate adders have been proposed, for example, ETA-II [5], ETA-IIM [5], ACA [6, 7], GDA [8], etc. that improve the performance of adder blocks beyond that of the conventional accurate designs. Each approximate low-latency adder has its own unique error, performance, area, and power characteristics and, therefore, are suitable for different scenarios. Almost all such adders can be categorized under the umbrella of block-based adders, as they employ smaller sub-adder units/blocks which operate in parallel to compute the resultant bits of the output. A few example approximate low-latency adders are shown in Fig. 1.1.

A cross-layer approach towards developing efficient embedded Deep Learning systems

Microprocessors and Microsystems

Exploiting Vulnerabilities in Deep Neural Networks: Adversarial and Fault-Injection Attacks

ArXiv, 2021

From tiny pacemaker chips to aircraft collision avoidance systems, the state-of-the-art Cyber-Phy... more From tiny pacemaker chips to aircraft collision avoidance systems, the state-of-the-art Cyber-Physical Systems (CPS) have increasingly started to rely on Deep Neural Networks (DNNs). However, as concluded in various studies, DNNs are highly susceptible to security threats, including adversarial attacks. In this paper, we first discuss different vulnerabilities that can be exploited for generating security attacks for neural network-based systems. We then provide an overview of existing adversarial and fault-injection-based attacks on DNNs. We also present a brief analysis to highlight different challenges in the practical implementation of adversarial attacks. Finally, we also discuss various prospective ways to develop robust DNN-based systems that are resilient to adversarial and fault-injection attacks.

Sejarah kebudayaan Islam MTs kelas 3

FANNet: Formal Analysis of Noise Tolerance, Training Bias and Input Sensitivity in Neural Networks

2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020

With a constant improvement in the network architectures and training methodologies, Neural Netwo... more With a constant improvement in the network architectures and training methodologies, Neural Networks (NNs) are increasingly being deployed in real-world Machine Learning systems. However, despite their impressive performance on "known inputs", these NNs can fail absurdly on the "unseen inputs", especially if these real-time inputs deviate from the training dataset distributions, or contain certain types of input noise. This indicates the low noise tolerance of NNs, which is a major reason for the recent increase of adversarial attacks. This is a serious concern, particularly for safety-critical applications, where inaccurate results lead to dire consequences. We propose a novel methodology that leverages model checking for the Formal Analysis of Neural Network (FANNet) under different input noise ranges. Our methodology allows us to rigorously analyze the noise tolerance of NNs, their input node sensitivity, and the effects of training bias on their performance, ...

TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation

IEEE Access

Convolutional Neural Networks (CNNs) in the Internet-of-Things (IoT)-based applications face stri... more Convolutional Neural Networks (CNNs) in the Internet-of-Things (IoT)-based applications face stringent constraints, like limited memory capacity and energy resources due to many computations in convolution layers. In order to reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filter termed as symmetry approximation and Winograd algorithm structure termed as tile quantization approximation. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds to control each approximation's intensity. The proposed methods have been evaluated on ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved ∼5.28x multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is ∼1.08x less multiplicative workload as compared to state-of-the-art Winograd CNN pruning. For LeNet, ∼3.87x and ∼3.93x was the multiplicative workload reduction for MNIST and Fashion-MNIST datasets. The additive workload reduction was ∼2.5x and ∼2.56x for the respective datasets. There is no significant accuracy loss for MNIST and Fashion-MNIST dataset.

Approximation of 5-by-5 filter: To find the symmetric and anti-symmetric coefficients, two possible configurations are considered. Fig. 12 shows the kernel masks for adjacent and opposite sides, where the red cells correspond to -1 and the blue ones correspond to 1. White cells express zero values. For maximum symmetry/anti-symmetry, it is necessary to check the possibilities of adjacent- and opposite-side symmetry and anti-symmetry at rotations of 0, 90, 180, and 270 degrees. Equations 9 and 10 give the symmetric coefficient W and the anti-symmetric coefficient W′, whereas Equations 7 and 8 represent the point-wise multiplication of the masks shown in Fig. 12. Further, a multiplicative and accumulative result is defined for symmetry and for anti-symmetry, respectively.

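Pembuatan Aplikasi Multimedia Informasi pada Kios Informasi Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Sebelas Maret Surakarta [Development of a Multimedia Information Application for the Information Kiosk of the Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret Surakarta]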

ABSTRACT. Abdullah Hanif. 2018. "Pembuatan Aplikasi Multimedia Informasi Pada Kios Informasi Fakultas Matematika Dan Ilmu Pengetahuan Alam Universitas Sebelas Maret". Diploma III Program in Informatics Engineering, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret Surakarta. The Faculty of Mathematics and Natural Sciences has endeavored to provide information boards that the academic community can use to share and receive information. However, the management and delivery of the existing boards are in general still less than optimal, so the academic community's interest in obtaining information has not been well served. This application was developed using Luther's multimedia development method, which consists of concept, design, material collecting, assembly, testing, and distribution. The application has 4 interface designs: design...

A survey of hardware architectures for generative adversarial networks

J. Syst. Archit., 2021

Recent years have witnessed a significant interest in "generative adversarial networks" (GANs) due to their ability to generate high-fidelity data. Many models of GANs have been proposed for a diverse range of domains ranging from natural language processing to image processing. GANs have high compute and memory requirements. Also, since they involve both convolution and deconvolution operations, they do not map well to the conventional accelerators designed for convolution operations. Evidently, there is a need for customized accelerators for achieving high efficiency with GANs. In this work, we present a survey of techniques and architectures for accelerating GANs. We organize the works on key parameters to bring out their differences and similarities. Finally, we present research challenges that are worthy of attention in the near future. More than summarizing the state of the art, this survey seeks to spark further research in the field of GAN accelerators.

DNN-Life: An Energy-Efficient Aging Mitigation Framework for Improving the Lifetime of On-Chip Weight Memories in Deep Neural Network Hardware Architectures

2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Negative Bias Temperature Instability (NBTI)-induced aging is one of the critical reliability threats in nano-scale devices. This paper makes the first attempt to study NBTI aging in the on-chip weight memories of deep neural network (DNN) hardware accelerators, subjected to complex DNN workloads. We propose DNN-Life, a specialized aging analysis and mitigation framework for DNNs, which jointly exploits hardware- and software-level knowledge to improve the lifetime of a DNN weight memory with reduced energy overhead. At the software level, we analyze the effects of different DNN quantization methods on the distribution of the bits of weight values. Based on the insights gained from this analysis, we propose a micro-architecture that employs low-cost memory-write (and read) transducers to achieve an optimal duty-cycle at run time in the weight memory cells, thereby balancing their aging. As a result, our DNN-Life framework enables efficient aging mitigation of the weight memory of the given DNN hardware at minimal energy overhead during the inference process.

Hardware–Software Approximations for Deep Neural Networks

Approximate Circuits

Neural networks (NNs) are the state of the art for many artificial intelligence (AI) applications. However, in order to facilitate the training process, most neural networks are over-parameterized and result in significant computational and memory overheads. Therefore, to alleviate the computational and memory requirements of these NNs, numerous optimization techniques have been proposed. In this chapter, we highlight one of the prominent paradigms, i.e., approximate computing, that can significantly improve the resource requirements of these networks. We describe a sensitivity analysis methodology for estimating the significance of sub-parts of the state-of-the-art NNs. Based upon the significance analysis, we then present a methodology for employing a tolerable amount of approximation at various stages of the network, i.e., removal of ineffectual filters/neurons at the software layer and precision reduction and memory approximations at the hardware layer. Towards the end of this chapter, we also highlight a few of the prominent challenges in adopting different types of approximation and the effects that they have on the overall efficiency and accuracy of the baseline networks.

Dependable Deep Learning: Towards Cost-Efficient Resilience of Deep Neural Network Accelerators against Soft Errors and Permanent Faults

2020 IEEE 26th International Symposium on On-Line Testing and Robust System Design (IOLTS), 2020

Deep Learning has enabled machines to learn computational models (i.e., Deep Neural Networks – DNNs) that can perform certain complex tasks with claims to be close to human-level precision. This state-of-the-art performance offered by DNNs in many Artificial Intelligence (AI) applications has paved their way to being used in several safety-critical applications where even a single failure can lead to catastrophic results. Therefore, improving the robustness of these models to hardware-induced faults (such as soft errors, aging, and manufacturing defects) is of significant importance to avoid any disastrous event. Traditional redundancy-based fault mitigation techniques cannot be employed in a wide range of applications due to their high overheads, which, when coupled with the compute-intensive nature of DNNs, lead to undesirable resource consumption. In this article, we present an overview of different low-cost fault-mitigation techniques that exploit the intrinsic characteristics of DNNs...

QuAd: Design and analysis of Quality-area optimal Low-Latency approximate Adders

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), 2017

Approximate circuits exploit the error resilience of applications to trade off computation quality (accuracy) for gains in performance, power, and/or area. While state-of-the-art low-latency approximate adders provide an accuracy-area-latency configurable design space, the selection of a particular configuration from the design space is still done manually. In this paper, we analytically examine different structural properties of low-latency approximate adders to formulate a new adder model, Quality-area optimal Low-Latency approximate Adder (QuAd). It provides a larger design space than the state-of-the-art, with design points that require less logic area for the same accuracy than state-of-the-art approximate adders. Furthermore, based upon our mathematical analysis, we show that, provided a latency constraint, an adder configuration with the highest quality and lowest area requirement can effortlessly be selected from the whole d...
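As a rough illustration of the accuracy-latency trade-off that low-latency approximate adders expose, the sketch below models a generic segmented adder in Python. It is not the QuAd model from the paper: the function name `approx_add` and the uniform parameters `width`, `r` (result bits per sub-adder), and `p` (lower bits used to predict the carry-in) are assumptions chosen only for illustration.

```python
def approx_add(a, b, width=16, r=4, p=2):
    """Approximate addition with truncated carry chains (illustrative model).

    Each sub-adder produces r result bits. Its carry-in is predicted from
    only the p bits directly below it instead of the full carry chain, so
    the longest carry path is r + p bits rather than width bits.
    The final carry-out is dropped, so the result has width bits.
    """
    result = 0
    for lo in range(0, width, r):
        start = max(0, lo - p)          # lowest bit seen by this sub-adder
        span = lo + r - start           # number of bits this sub-adder adds
        mask = (1 << span) - 1
        partial = ((a >> start) & mask) + ((b >> start) & mask)
        # Keep only the r result bits owned by this sub-adder.
        bits = (partial >> (lo - start)) & ((1 << r) - 1)
        result |= bits << lo
    return result


# No long carry chains: the approximate result matches the exact sum.
no_carry = approx_add(0x1234, 0x1111)

# A carry generated in bits 0-1 must ripple through bits 2-7; the second
# sub-adder only sees bits 2-7, misses that carry, and the result is wrong.
long_carry = approx_add(0x00FF, 0x0001)

# With the prediction window covering the whole word, the adder is exact.
exact = approx_add(0x00FF, 0x0001, p=16)
```

Increasing `p` lengthens the carry path and reduces the error rate, while increasing `r` reduces the number of sub-adders; choosing such parameters under a latency constraint is the kind of design-space selection the abstract refers to.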

ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks considering Unreliable Memories

Spiking neural networks (SNNs) have shown a potential for having low energy with unsupervised learning capabilities due to their biologically-inspired computation. However, they may suffer from accuracy degradation if their processing is performed in the presence of hardware-induced faults in memories, which can come from manufacturing defects or voltage-induced approximation errors. Since recent works still focus on the fault-modeling and random fault injection in SNNs, the impact of memory faults in SNN hardware architectures on accuracy and the respective fault-mitigation techniques are not thoroughly explored. Toward this, we propose ReSpawn, a novel framework for mitigating the negative impacts of faults in both the off-chip and on-chip memories for resilient and energy-efficient SNNs. The key mechanisms of ReSpawn are: (1) analyzing the fault tolerance of SNNs; and (2) improving the SNN fault tolerance through (a) fault-aware mapping (FAM) in memories, and (b) fault-aware t...

Masa depan pesantren : dalam tantangan modernitas dan tantangan kompleksitas global [The Future of Pesantren: Amid the Challenges of Modernity and of Global Complexity]

Paper Title (use style: paper title)

The exponential increase in dependencies between the cyber and physical world leads to an enormous amount of data which must be efficiently processed and stored. Therefore, computing paradigms are evolving towards machine learning (ML)-based systems because of their ability to efficiently and accurately process the enormous amount of data. Although ML-based solutions address the efficient computing requirements of big data, they introduce (new) security vulnerabilities into the systems, which cannot be addressed by traditional monitoring-based security measures. Therefore, this paper first presents a brief overview of various security threats in machine learning, their respective threat models and associated research challenges to develop robust security measures. To illustrate the security vulnerabilities of ML during training, inferencing and hardware implementation, we demonstrate some key security threats on ML using LeNet and VGGNet for MNIST and German Traffic Sign Recognition B...

Perancangan Sistem Pengenalan Suara Sebagai Pengendali Laptop Berbasis Arduino Uno [Design of a Voice Recognition System as an Arduino Uno-Based Laptop Controller]

Advances in technology bring many benefits to everyday life. One innovation resulting from this progress is voice command, which lets users control their electronic devices, switching them on or off, using only spoken commands. The spoken digital audio is processed and controlled by the system to recognize the detected voice command. The voice recognition system is designed to make it easier for users to operate a laptop by voice. The Arduino Uno-based voice recognition system uses the EasyVR module as the voice recognition module, together with a wireless microphone so that commands can be given from a distance far from the laptop. The results of this research are expected to serve as a prototype voice recognition system for turning a laptop on or off by voice command...

Weight Quantization Retraining for Sparse and Compressed Spatial Domain Correlation Filters

Using Spatial Domain Correlation Pattern Recognition (CPR) in Internet-of-Things (IoT)-based applications often faces constraints, like inadequate computational resources and limited memory. To reduce the computation workload of inference due to large spatial-domain CPR filters and convert filter weights into hardware-friendly data-types, this paper introduces the power-of-two (Po2) and dynamic-fixed-point (DFP) quantization techniques for weight compression and sparsity induction in filters. Weight quantization re-training (WQR), the log-polar, and the inverse log-polar geometric transformations are introduced to reduce quantization error. WQR is a method of retraining the CPR filter, which is presented to recover the accuracy loss. It forces the given quantization scheme by adding the quantization error in the training sample and then re-quantizes the filter to the desired quantization levels, which reduces quantization noise. Further, Particle Swarm Optimization (PSO) is used t...
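To make the idea of power-of-two (Po2) quantization concrete, here is a minimal Python sketch of a generic Po2 rounding rule. It is not the paper's WQR procedure or its exact quantizer: the function name `quantize_po2` and the exponent bounds `min_exp` and `max_exp` are assumptions for illustration, and zeroing very small weights is just one simple way such a scheme can induce sparsity.

```python
import math


def quantize_po2(w, min_exp=-8, max_exp=0):
    """Round a weight to a signed power of two (rounding in the log domain).

    min_exp and max_exp are illustrative bounds, not values from the paper.
    A weight whose rounded exponent falls below min_exp is set to zero,
    which introduces sparsity; larger exponents are clipped to max_exp.
    """
    if w == 0.0:
        return 0.0
    exp = round(math.log2(abs(w)))
    if exp < min_exp:
        return 0.0
    exp = min(exp, max_exp)
    return math.copysign(2.0 ** exp, w)


weights = [0.30, -0.07, 0.001, 0.9]
quantized = [quantize_po2(w) for w in weights]
# quantized is [0.25, -0.0625, 0.0, 1.0]
```

Because every nonzero quantized weight is a signed power of two, a multiplication by it can be implemented in hardware as a bit shift, which is what makes the data type hardware-friendly.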

Approximate computing across the hardware and software stacks

Many-Core Computing: Hardware and Software

Emerging fields like big data and IoT have brought a number of challenges for the hardware as well as software design communities. Some of the major challenges are to scale the computational and memory resources and the efficiency of the processing devices as per the growing needs. In the past few years, a number of fields have emerged for addressing these challenges. We focus on one of the prominent paradigms that have the potential to improve the resource efficiency regardless of the underlying technology, i.e., approximate computing (AC). AC aims at relaxing the bounds of exact computing to provide new opportunities for achieving gains in terms of energy, power, performance, and/or area efficiency at the cost of reduced output quality, typically within the tolerable range. We first provide an overview of AC and the techniques which are commonly being employed at different abstraction levels for alleviating the resource requirements of computationally intensive applications. Afterwards, a detailed discussion on component-level approximations and their probabilistic behavior by considering approximate adders and multipliers is presented. At the next step, a methodology used to construct efficient accelerators from these components will be discussed. The discussion will then be extended to approximate memories and runtime management systems. Toward the end of the chapter, we present a methodology for designing energy efficient many-core systems based upon approximate components followed by the challenges in adopting a cross-layer approach for designing highly energy, power, and performance-efficient systems.