Papers by Mokhtar Nibouche
On designing digit multipliers
Folding and unfolding techniques cannot be used to design pipelined digit adders because of the presence of feedback loops. This paper presents approaches for the design of digit multipliers that can be pipelined to the bit level. These include architectures obtained via unfolding, the high-radix approach and the multi-pipe approach. Pipelining these architectures is made possible by a new "pipelined digit adder". The presented architectures are scalable and systolic, and offer great flexibility in finding the best trade-off between hardware cost and throughput rate by changing the level of pipelining and the digit size.
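As a rough behavioural illustration of the digit-size parameter mentioned above, the following sketch multiplies two unsigned integers while consuming the multiplier one digit per step. It is a software model only, not the pipelined architecture of the paper; the function name `digit_serial_multiply` and its parameters are hypothetical.

```python
def digit_serial_multiply(a: int, b: int, width: int, digit_size: int) -> int:
    """Multiply unsigned `width`-bit integers, consuming `b` one digit per step.

    Hypothetical behavioural model: each loop iteration stands for one
    digit cycle; it says nothing about pipelining or hardware cost.
    """
    if digit_size <= 0 or width <= 0:
        raise ValueError("width and digit_size must be positive")
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in `width` bits")

    digit_mask = (1 << digit_size) - 1
    steps = -(-width // digit_size)  # ceil(width / digit_size)
    acc = 0
    for i in range(steps):
        digit = (b >> (i * digit_size)) & digit_mask  # next digit of b, LSD first
        acc += (a * digit) << (i * digit_size)        # weighted partial product
    return acc
```

A larger `digit_size` means fewer steps per product but a wider partial product per step, which is the cost/throughput trade-off the abstract refers to.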
Application of DSPs and Microcontrollers in Voltage Source Inverters STATCOM Digital Designs: A Comparative Approach
The purpose of this paper is to present the design and implementation of a laboratory-based five-level voltage source inverter (VSI) used in static synchronous compensators (STATCOM) to constrain power system load bus voltages. It focuses primarily on microcontroller- and microprocessor-based design and on the control system requirements for the STATCOM, and then provides a comparison with a DSP processor from the TMS320C2000 series from Texas Instruments.
An FPGA-based wavelet transforms coprocessor
Although FPGA technology offers the potential of designing high performance systems at low cost for a wide range of applications, its programming model is prohibitively low level, requiring either a dedicated FPGA-experienced programmer or basic digital design knowledge. To allow a signal/image processing end-user to benefit from this kind of device, the level of design abstraction needs to be raised,

Signal Processing, 2004
The new modular multiplier structures proposed in this paper are based on a short-precision magnitude comparison instead of a full magnitude comparison. Another feature of these structures is that the comparison operations are carried out first; only once they are complete does the reduction operation take place, whereas in previous work the comparison and reduction operations are interleaved. This reduces the number of stages required to implement the modular reduction operation. Serial implementations have shown that the new radix-2 algorithm has better area usage than similar structures available in the literature, while the proposed radix-4 algorithm exhibits better area usage than similar structures at relatively similar speed. The parallel implementation of these algorithms has also shown that the new radix-4 algorithm has the best area usage, while its speed is similar to that of structures proposed in the literature. Contact authors: O. Nibouche, M. Nibouche, A. Bouridane.
A new pipelined digit serial-parallel multiplier
Digit-serial architectures obtained using traditional unfolding and folding techniques cannot be pipelined beyond a certain level because of the presence of feedback loops. In this paper, a novel approach for the design of pipelined digit serial-parallel multipliers is presented.
The most common clues left at a crime scene are shoeprint impressions. These impressions are useful in the detection of criminals and the linking of crime scenes. A novel technique for the detection and classification of shoeprint impressions has been developed. The technique is based on fractal-based feature extraction and pattern matching methods. The computerized system developed has been extensively tested on a large database of real shoeprint impressions and is robust to small variations in image orientation and/or translation.
Rapid prototyping of biorthogonal discrete wavelet transforms on FPGAs
The purpose of this paper is to present a methodology for rapid prototyping of biorthogonal wavelet transforms on FPGAs. The methodology is based on adequate partitioning of a time-interleaved architecture that is free of "wait cycles". The design has been captured using a schematic capture tool and can be parameterised in terms of the number of filter coefficients, data and coefficient word-lengths, digit size and degree of pipelining. The efficiency of the approach has been verified on the Xilinx 4000 FPGA series.
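For readers unfamiliar with biorthogonal wavelet transforms, here is a minimal sketch of one decomposition level using the well-known reversible LeGall 5/3 biorthogonal wavelet in its integer lifting form. This wavelet and the lifting formulation are chosen here purely for illustration; the abstract does not say which filters or structure the paper uses, and the function names are hypothetical.

```python
def lifting_53_forward(x):
    """One level of the integer LeGall 5/3 transform on an even-length list.

    Illustrative assumption: boundaries are handled by repeating the
    nearest available neighbour, applied identically in the inverse.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict step: detail = odd sample minus the average of its even neighbours.
    detail = [odd[i] - ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    # Update step: approximation = even sample plus a rounded quarter of nearby details.
    approx = [even[i] + ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2) for i in range(n)]
    return approx, detail


def lifting_53_inverse(approx, detail):
    """Undo lifting_53_forward exactly by reversing the two lifting steps."""
    n = len(approx)
    even = [approx[i] - ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2) for i in range(n)]
    odd = [detail[i] + ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    x = [0] * (2 * n)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because each lifting step is undone by subtracting exactly what was added, the round trip is lossless on integers, which is convenient when checking a hardware model against a software reference.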
Bit-level architectures for Montgomery's multiplication
Algorithms and architectures for performing modular multiplication are important in cryptography and residue number systems. In this paper, Montgomery's algorithm is broken into two concurrent, non-interleaved multiplication operations. The architectures derived from this algorithm are systolic and need only near-neighbour communication links, which makes them well suited to VLSI implementation. The presented architectures offer great flexibility in finding the best trade-off between hardware cost and throughput rate by changing the digit size.
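As background, the sketch below shows the textbook radix-2 Montgomery multiplication, in which accumulation and reduction are interleaved inside one loop. It is the traditional form that the paper starts from, not the two-concurrent-multiplications variant it proposes; the function name and parameters are hypothetical.

```python
def montgomery_multiply(a: int, b: int, modulus: int, n_bits: int) -> int:
    """Return a * b * 2**(-n_bits) mod modulus (textbook radix-2 Montgomery).

    Assumes an odd modulus with modulus < 2**n_bits and 0 <= a, b < modulus.
    """
    if modulus % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    if not (0 <= a < modulus and 0 <= b < modulus and modulus < (1 << n_bits)):
        raise ValueError("operands out of range")

    acc = 0
    for i in range(n_bits):
        acc += ((a >> i) & 1) * b   # add b if bit i of a is set
        if acc & 1:                 # make acc even by adding the odd modulus
            acc += modulus
        acc >>= 1                   # exact division by 2
    # The loop keeps acc below 2 * modulus, so one subtraction suffices.
    return acc - modulus if acc >= modulus else acc
```

The result carries an extra factor of 2**(-n_bits), so operands are normally converted into and out of Montgomery form around a chain of multiplications.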
Design and FPGA implementation of orthonormal discrete wavelet transforms
FPGA technology offers the potential for low cost and high performance for certain applications, including image processing. However, the programming model that FPGAs typically present to application developers is prohibitively low level. This paper presents a novel bit-serial architecture based on a time-interleaved structure. To overcome the problem of wait cycles within the structure, a second line of bit adders is provided. This allows the structure to use additional "dummy" cycles to deal with additional bits. The proposed architecture is modular and scalable, which allows bit-level parameterisation. To assess the effectiveness of the approach, the design has been implemented efficiently on Xilinx 4000 series FPGAs.
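The basic building block of a bit-serial datapath is the bit adder, which handles one bit per clock cycle and stores the carry between cycles. The following sketch models a single such adder in software; it illustrates the bit-serial principle only and does not reproduce the time-interleaved structure or the second line of adders described in the abstract. All names are hypothetical.

```python
def to_bits(value: int, width: int):
    """Least-significant-bit-first list of `width` bits of a non-negative value."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit streams, one bit per 'cycle'.

    Returns (sum_bits, carry_out). The `carry` variable plays the role of
    the carry flip-flop in a hardware bit-serial adder.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("bit streams must have equal length")
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)          # full-adder sum output
        carry = (a & b) | (carry & (a ^ b))     # full-adder carry output
    return sum_bits, carry
```

A word of `width` bits therefore takes `width` cycles to add, which is why the word length directly sets the latency of a bit-serial design.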

Speed and area trade-offs for FPGA-based implementation of RSA architectures
In this paper, new structures that implement the RSA cryptographic algorithm are presented. These structures are built upon a modified Montgomery modular multiplier, in which the multiplication and modular reduction operations are carried out in parallel rather than interleaved as in the traditional Montgomery multiplier. A digit-based approach is adopted, in which the digit size and the level of pipelining of the structures are varied. This parameterised approach gives the designer an efficient way of choosing the architecture that best suits the user requirements in terms of speed and area usage, an issue of critical importance for resource-limited FPGA chips. Furthermore, global broadcast data lines are avoided by interleaving multiple encryption operations into the same structure, thus making the implementation systolic. FPGA implementation results have shown that the proposed RSA structures outperform structures built around the traditional Montgomery multiplier in terms of speed and area usage.
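To make the role of the multiplier in RSA concrete, the sketch below performs modular exponentiation by repeated calls to a digit-based Montgomery multiplier with a configurable digit size. It uses the traditional interleaved Montgomery form, not the modified parallel multiplier of the paper, and it models arithmetic only, not pipelining or area. The names `mont_mul` and `mont_pow` are hypothetical, and the three-argument `pow` with a negative exponent assumes Python 3.8 or later.

```python
def mont_mul(a: int, b: int, modulus: int, digit_size: int, steps: int) -> int:
    """Return a * b * 2**(-digit_size * steps) mod modulus, one digit of a per step.

    Assumes an odd modulus, 0 <= a, b < modulus and
    modulus < 2**(digit_size * steps).
    """
    base = 1 << digit_size
    m_prime = (-pow(modulus, -1, base)) % base  # -modulus^(-1) mod 2**digit_size
    acc = 0
    for i in range(steps):
        a_digit = (a >> (i * digit_size)) & (base - 1)
        acc += a_digit * b
        q = ((acc & (base - 1)) * m_prime) & (base - 1)  # clears the low digit
        acc = (acc + q * modulus) >> digit_size          # exact shift
    return acc - modulus if acc >= modulus else acc


def mont_pow(message: int, exponent: int, modulus: int, digit_size: int = 4) -> int:
    """Left-to-right square-and-multiply built on mont_mul (illustrative only)."""
    if modulus % 2 == 0 or modulus < 3 or exponent < 0:
        raise ValueError("need an odd modulus >= 3 and a non-negative exponent")
    steps = -(-modulus.bit_length() // digit_size)  # digits needed to cover modulus
    r = 1 << (digit_size * steps)
    r2 = (r * r) % modulus
    x_bar = mont_mul(message % modulus, r2, modulus, digit_size, steps)  # to Montgomery form
    acc = mont_mul(1, r2, modulus, digit_size, steps)                    # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, modulus, digit_size, steps)
        if bit == "1":
            acc = mont_mul(acc, x_bar, modulus, digit_size, steps)
    return mont_mul(acc, 1, modulus, digit_size, steps)                  # leave Montgomery form
```

Changing `digit_size` alters the number of loop iterations per multiplication, which is the software analogue of the speed/area knob described in the abstract.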
New iterative algorithms and architectures of modular multiplication for cryptography
Algorithms and architectures for performing modular multiplication, which is central to cryptosystems and authentication schemes, are important for today's secure communication needs. This paper presents two new iterative algorithms for modular multiplication. Implementing these algorithms yields scalable architectures that can be used for any modulus without altering the design. In addition, the radix-2 algorithm shows almost similar features when compared with similar architectures available in the literature. Furthermore, the radix-4 algorithm can be used to develop higher-radix algorithms, since it only requires the use of powers of two of the modulus.
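For orientation, a classical iterative radix-2 modular multiplication (shift, conditional add, then compare and subtract) can be sketched as follows. This is the standard interleaved baseline, shown under the assumption that it resembles the kind of structure the new algorithms are compared against; it is not one of the two algorithms proposed in the paper, and the function name is hypothetical.

```python
def interleaved_modmul(a: int, b: int, modulus: int, n_bits: int) -> int:
    """Return (a * b) mod modulus, scanning `a` from its most significant bit.

    Assumes 0 <= a < 2**n_bits, 0 <= b < modulus and modulus >= 1.
    Works for any modulus, odd or even.
    """
    if not (modulus >= 1 and 0 <= a < (1 << n_bits) and 0 <= b < modulus):
        raise ValueError("operands out of range")
    acc = 0
    for i in reversed(range(n_bits)):
        acc <<= 1                   # double the running result
        if (a >> i) & 1:
            acc += b                # add the multiplicand for a set bit
        # acc < 3 * modulus here, so at most two subtractions are needed.
        if acc >= modulus:
            acc -= modulus
        if acc >= modulus:
            acc -= modulus
    return acc
```

Each iteration needs full-width magnitude comparisons against the modulus, which is typically the costly part of such a loop in hardware.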
Architectures for Montgomery's multiplication
IEE Proceedings - Computers and Digital Techniques, 2003
A new bi-directional bit serial-parallel multiplication architecture is presented. The proposed structure is regular and modular, and requires nearest-neighbour communication links only, which makes it more efficient for VLSI implementation. Furthermore, a judicious deployment of latches in the circuit ensures that the multiplier operates on two coefficients of the multiplicand at the same time, thus speeding up the process. Comparison of the new multiplier structure with previous ones has shown the superiority of the new architecture.
Psychnology Journal, 2003
In this paper, new structures that implement the RSA cryptographic algorithm are presented. These structures are built using a modified Montgomery modular multiplier, in which the multiplication and modular reduction operations are carried out in parallel rather than interleaved as in the traditional Montgomery multiplier. Global broadcast data lines are avoided by interleaving two operations into the same structure, thus making the implementation systolic. FPGA implementation results have shown that the proposed RSA structures outperform structures built around a traditional Montgomery multiplier in terms of speed. In terms of area usage, the paper presents an area-efficient architecture that combines high speed with reduced area usage when compared with other architectures.
New architectures for serial-serial multiplication
Traditional serial-serial multiplier structures suffer from inefficient generation of partial products, which leads to hardware overuse and slow systems. In this paper, two new architectures for fully serial multiplication are presented. To the best of our knowledge, the first structure is the first fully serial multiplier reported in the literature whose speed is comparable to that of existing serial-parallel multipliers. The second structure requires an extra multiplexer in the clock path, which makes it slower, but has the merit of reducing the latency of the multiplier. Both structures are systolic and need only near-neighbour communication links. Compared with available architectures, an FPGA-based implementation has shown an increase in the speed of the multipliers of about 200% for the first structure and 150% for the second structure.


In this work, new structures that implement the RSA cryptographic algorithm are presented. These structures are built upon a modified Montgomery modular multiplier, in which the multiplication and modular reduction operations are carried out in parallel rather than interleaved as in the traditional Montgomery multiplier. Global broadcast of data lines is avoided by interleaving two or more encryption/decryption operations onto the same structure, thus making the implementation systolic and scalable. A digit-based approach is adopted, in which the digit size and the level of pipelining of the structures are varied. This parameterised approach gives the designer an efficient way of choosing the architecture that best suits their requirements in terms of speed and area usage, an issue of critical importance for resource-limited FPGA chips. FPGA implementation results have shown that the proposed RSA structures outperform structures built around the traditional Montgomery multiplier in terms of speed, thanks to the avoidance of global broadcast lines.
Although FPGA technology offers the potential of designing high performance systems at low cost, its programming model is prohibitively low level. To allow a novice signal/image processing end-user to benefit from this kind of device, the level of design abstraction needs to be raised. This approach helps the application developer to focus on signal/image processing algorithms rather than on low-level designs and implementations. This paper presents a framework for an FPGA-based Discrete Wavelet Transform (DWT) system. The approach helps the end-user to generate FPGA configurations for the DWT at a high level without any knowledge of the low-level design styles and architectures.