Recently, a number of candidate instructions have been proposed to efficiently compute arbitrary bit permutations. Among these, GRP is the most attractive, having utility for other applications in addition to permutation such as sorting and having good inherent cryptographic properties. However, the current implementation of GRP is the slowest of the candidates; BFLY, on the other hand, is the fastest. In this paper, we examine the possibility of executing GRP on a butterfly or an inverse butterfly network.
Permutation is widely used in cryptographic algorithms. However, it is not well-supported in existing instruction sets. In this paper, two instructions, PPERM3R and GRP, are proposed for efficient software implementation of arbitrary permutations. The PPERM3R instruction can be used for dynamically specified permutations; the GRP instruction can be used to do arbitrary n-bit permutations with up to lg(n) instructions. In addition, a systematic method for determining the instruction sequence for performing an arbitrary permutation is described.
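The abstract above does not spell out what GRP computes. A minimal sketch follows, assuming the commonly cited definition: bits of the source whose control bit is 0 are gathered to the left, bits whose control bit is 1 are gathered to the right, and relative order is preserved within each group. The function name `grp` and the default width `n=8` are illustrative choices, not taken from the paper.

```python
def grp(x: int, ctrl: int, n: int = 8) -> int:
    """Software model of one GRP step on an n-bit word (assumed semantics).

    Bits of x whose control bit is 0 move to the left group, bits whose
    control bit is 1 move to the right group; order inside a group is kept.
    """
    zeros = []  # data bits selected by control bit 0
    ones = []   # data bits selected by control bit 1
    for i in range(n - 1, -1, -1):  # scan from most to least significant bit
        bit = (x >> i) & 1
        if (ctrl >> i) & 1:
            ones.append(bit)
        else:
            zeros.append(bit)
    result = 0
    for bit in zeros + ones:  # left group first, then right group
        result = (result << 1) | bit
    return result
```

For example, `grp(0b10110010, 0b01010101)` collects the bits at positions 7, 5, 3, 1 on the left and those at positions 6, 4, 2, 0 on the right. Choosing the control words that realize a given permutation in at most lg(n) steps is the subject of the paper's systematic method and is not attempted here.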
International Journal of Embedded Systems, 2008
Block ciphers are used to encrypt data and provide data confidentiality. For interoperability reasons, it is desirable to support a variety of block ciphers efficiently. Of the basic operations in block ciphers, bit permutation is the slowest on existing processors, followed by integer multiplication. Although new permutation instructions proposed recently can accelerate bit permutations in general-purpose processors, reducing the number of instructions needed to achieve an arbitrary n-bit permutation from O(n) to O(log₂(n)), the data dependency between permutation instructions prevents them from being executed in fewer than log₂(n) cycles, even on superscalar processors. Since Application-Specific Instruction-Set Processors (ASIPs) have fewer constraints on maintaining standard processor datapath and control conventions, six alternative ASIP approaches are proposed in this paper to achieve arbitrary 64-bit permutations in one or two cycles without increasing the cycle time. These approaches use new BFLY and IBFLY instructions. We also compare these approaches and their efficiency in performing arbitrary 64-bit permutations.
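As a rough illustration of the network behind BFLY and IBFLY, the sketch below models an n-bit butterfly network as log₂(n) stages of conditional swaps, with pair distances n/2, n/4, ..., 1, and the inverse butterfly as the same stages with distances 1, 2, ..., n/2. The control encoding (one n-bit mask per stage, where a set bit at the lower index of a pair swaps that pair) and all function names are assumptions made for this example, not the encoding used in the paper.

```python
from typing import Sequence


def _stage(x: int, mask: int, d: int, n: int) -> int:
    """Swap bit pairs (i, i + d) of x wherever mask has bit i set."""
    for i in range(n):
        if i & d == 0 and (mask >> i) & 1:      # i is the lower index of a pair
            if ((x >> i) ^ (x >> (i + d))) & 1:  # bits differ, so swapping flips both
                x ^= (1 << i) | (1 << (i + d))
    return x


def butterfly(x: int, controls: Sequence[int], n: int = 8) -> int:
    """Apply stages with pair distance n/2, n/4, ..., 1 (one mask per stage)."""
    d = n // 2
    for mask in controls:
        x = _stage(x, mask, d, n)
        d //= 2
    return x


def inverse_butterfly(x: int, controls: Sequence[int], n: int = 8) -> int:
    """Apply stages with pair distance 1, 2, ..., n/2 (one mask per stage)."""
    d = 1
    for mask in controls:
        x = _stage(x, mask, d, n)
        d *= 2
    return x
```

Because every stage is its own inverse, running `inverse_butterfly` with the control masks in reversed order undoes `butterfly`. In hardware the swaps within a stage happen in parallel, which is why the stage count, not the bit count, drives the delay.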
Several bit permutation instructions, including GRP, OMFLIP, CROSS, and BFLY, have been proposed recently for efficiently performing arbitrary bit permutations. Previous work has shown that these instructions can accelerate a variety of applications such as block ciphers and sorting algorithms. In this paper, we compare the implementation complexity of these instructions in terms of delay. We use logical effort, a process technology independent method, to estimate the delay of the bit permutation functional units. Our results show that for 64-bit operations, the BFLY instruction is the fastest among these bit permutation instructions; the OMFLIP instruction is next; and the GRP instruction is the slowest.
As the number of electronic control units (ECUs) in an embedded system increases, so do the opportunities for manipulating the data inside the system. In an automotive system, 30 to 40 ECUs communicate with each other over a Controller Area Network (CAN) bus. The data communicated over this bus should be encrypted and authenticated; otherwise the consequences can be dire. Threats from the external environment may disrupt communication and invert the desired results. System architectures have been developed in the past that provide foolproof security against these concerns. Models such as EVITA and SEVECOM are designed specifically for automotive security and provide good resistance to internal as well as external threats. These models use heavy cryptographic engines such as AES, elliptic curves, and hash engines, which provide rich encryption standards, but these solutions are complex. These engines consume more power and occupy more footprint area, which makes them extremely difficult to implement in small-scale embedded systems. The algorithm proposed in this paper uses bit-level permutation instructions such as GRP and OMFLIP, which have good cryptographic properties, a smaller footprint area, and lower power consumption. This analysis will have a positive impact on upcoming trends in securing an environment. Based on the need and constraint, this paper introduces a methodical and holistic approach to using bit permutation instructions in a cryptographic environment that not only accelerates cryptography but also gives a low-cost, low-area solution for securing any network of small-scale embedded controllers. This proposal focuses on the continuing need to secure a system with less power consumption and less area. Implementing bit-level permutation instructions in hardware will address these concerns and pave the way for a security solution that requires less footprint area.
The increasing importance of secure information processing in publicly accessible Internet and wireless networks poses new challenges in the architecture of future general-purpose processors. Symmetric-key cryptography algorithms are an important class of algorithms used to achieve confidentiality. Many of them use a category of operations that are not well supported by today's word-oriented microprocessors: bit-oriented permutations. In this paper, we show how arbitrary bit permutations within a word can be achieved in just one or two cycles. This improves upon the O(n) instructions needed to achieve any one of n! permutations of n bits in existing RISC processors; it also improves upon recent work that achieves this in O(log(n)) instructions and cycles. This paper contributes two new architectural solutions, one with only microarchitecture changes, and another with ISA support as well.
n-bit permutations in programmable processors modeled on the theory of omega and flip networks. The new omflip instruction we introduce can perform any permutation of n subwords in log n instructions, with the subwords ranging from half-words down to single bits. Each omflip instruction can be done in a single cycle, with very efficient hardware implementation. The omflip instruction enhances a programmable processor's capability for handling multimedia and security applications which use subword permutations extensively.
WSEAS Transactions on Computers
Security is of utmost importance in every real-time application. The secure architectures implemented in automobiles, such as EVITA (E-safety Vehicle Intrusion Protected Applications) and SEVECOM (Secure Vehicle Communication), have rich cryptographic properties but a larger footprint area and high power consumption. These existing architectures use standard engines such as AES (Advanced Encryption Standard), elliptic curves, and hash engines, which are heavy in memory requirements and consume more power. Their reach is therefore limited to high-end systems consisting of large-bit processors and coprocessors. The role of bit permutation instructions in a cryptographic environment is well proven. GRP (Group Operations) and OMFLIP (Omega-Flip) are bit permutation instructions whose implementation in hardware not only accelerates software cryptography but also results in a smaller footprint area and low power consumption. This paper proposes a novel implementation and analysis of GRP and OMFL...
Bit permutation operations are interesting and important from both cryptographic and architectural points of view. Cryptographically, bit-level permutations naturally provide certain effects which are not easily obtained through word-level operations.
WSEAS Transactions on Information Science and Applications
With the increasing use of electronic control units (ECUs) in automobiles, or in any embedded system, security becomes an area of grave concern. Information is exchanged between ECUs over a CAN (Controller Area Network) bus and through vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) communication. These interactions open a wide gateway for manipulating information, which could lead to disastrous results. EVITA, SEVECOM, and SHE are existing security models that address these concerns in automobiles, but at the cost of a large footprint area and higher power consumption, as they use cryptographic engines such as AES-128, ECC, and HMAC. We propose the use of the bit-level permutation GRP (group operations) in a cryptographic environment, which not only accelerates cryptography but also provides a low-cost security solution with good encryption standards, a relatively small footprint area, and low power consumption. Use of GRP in cryptographic environment is a unique s...
New and emerging applications can change the mix of operations commonly used within computer architectures. It is sometimes surprising when instruction-set architecture (ISA) innovations intended for one purpose are used for other (initially unintended) purposes. This chapter considers recent proposals for the processor support of families of bit-level permutations. From a processor architecture point of view, the ability to support very fast bit-level permutations may be viewed as a further validation of the basic word-orientation of processors, and their ability to support next-generation secure multimedia processing. However, bitwise permutations are also fundamental operations in many cryptographic primitives and we discuss the suitability of these new operations for cryptographic purposes.
2016
The Half Butterfly Method is a new method for constructing the distinct circuits in complete graphs using the concept of isomorphism. It can be applied in combinatorics, for example in listing permutations of elements. In this paper, we present a permutation generation technique using the Half Butterfly Method.
IEEE Transactions on Signal Processing, 2013
A mathematical characterization of serially-pruned permutations (SPPs) employed in variable-length permuters and their associated fast pruning algorithms and architectures are proposed. Permuters are used in many signal processing systems for shuffling data and in communication systems as an adjunct to coding for error correction. Typically only a small set of discrete permuter lengths are supported. Serial pruning is a simple technique to alter the length of a permutation to support a wider range of lengths, but results in a serial processing bottleneck. In this paper, parallelizing SPPs is formulated in terms of recursively computing sums involving integer floor and related functions using integer operations, in a fashion analogous to evaluating Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is presented, and closed-form expressions for BRP statistics including descents/ascents, major index, excedances/descedances, inversions, and serial correlations are derived. It is shown that BRP sequences have weak correlation properties. Moreover, a new statistic called permutation inliers that characterizes the pruning gap of pruned interleavers is proposed. Using this statistic, a recursive algorithm that computes the minimum inliers count of a pruned BR interleaver (PBRI) in logarithmic time complexity is presented. This algorithm enables parallelizing a serial PBRI algorithm by any desired parallelism factor by computing the pruning gap in lookahead rather than a serial fashion, resulting in significant reduction in interleaving latency and memory overhead. Extensions to 2-D block and stream interleavers, as well as applications to pruned fast Fourier transforms and LTE turbo interleavers, are also presented. Moreover, hardware-efficient architectures for the proposed algorithms are developed. 
Simulation results of interleavers employed in modern communication standards demonstrate 3 to 4 orders of magnitude improvement in interleaving time compared to existing approaches.
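To make the terms in this abstract concrete, the sketch below builds a bit-reversal permutation (BRP) and a serially pruned version of it. The pruning rule used here (generate the BRP for the next power of two and drop indices at or beyond the target length) is an assumption about what "serial pruning" means in this context, and the function names are hypothetical. It illustrates the serial bottleneck the paper sets out to remove, not the paper's parallel algorithm.

```python
def bit_reverse(i: int, bits: int) -> int:
    """Reverse the low `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift the lowest bit of i into r
        i >>= 1
    return r


def bit_reversal_permutation(bits: int) -> list:
    """BRP over 2**bits indices: position k maps to bit_reverse(k, bits)."""
    return [bit_reverse(k, bits) for k in range(1 << bits)]


def pruned_bit_reversal(length: int) -> list:
    """Serially pruned BRP of arbitrary length (assumed pruning rule).

    Walks the full power-of-two BRP in order and skips out-of-range
    indices, so each output depends on how many were skipped before it.
    """
    bits = max(1, (length - 1).bit_length())
    return [j for j in bit_reversal_permutation(bits) if j < length]
```

With `length = 6`, the 8-point BRP `[0, 4, 2, 6, 1, 5, 3, 7]` loses the entries 6 and 7, leaving a permutation of the indices 0 to 5.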
Journal of Signal Processing Systems, 2008
Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence of instructions needed to emulate these complicated bit operations. As these bit manipulation operations are relevant to applications that are becoming increasingly important, we propose direct support for them in microprocessors. In particular, we propose fast bit gather (or parallel extract), bit scatter (or parallel deposit) and bit permutation instructions (including group, butterfly and inverse butterfly). We show that all these instructions can be implemented efficiently using both the fast butterfly and inverse butterfly network datapaths. Specifically, we show that parallel deposit can be mapped onto a butterfly circuit and parallel extract can be mapped onto an inverse butterfly circuit. We define static, dynamic and loop invariant versions of the instructions, with static versions utilizing a much simpler functional unit. We show how a hardware decoder can be implemented for the dynamic and loop-invariant versions to generate, dynamically, the control signals for the butterfly and inverse butterfly datapaths. The simplest functional unit we propose is smaller and faster than an ALU. We also show that these instructions yield significant speedups over a basic RISC architecture for a variety of different application kernels taken from application domains including bioinformatics, steganography, coding, compression and random number generation.
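The bit gather and bit scatter operations named above can be modelled in a few lines of software. The sketch below is a reference model of the semantics only, assuming the usual meaning of parallel extract (collect the bits selected by a mask into the low-order end) and parallel deposit (spread low-order bits out to the positions selected by a mask); the names `pext` and `pdep` are illustrative, and the bit-by-bit loop is exactly the kind of slow emulation the proposed instructions replace.

```python
def pext(x: int, mask: int) -> int:
    """Parallel extract (bit gather): pack the bits of x selected by mask
    into the low-order bits of the result, preserving their order."""
    result = 0
    k = 0  # next free position in the result
    i = 0  # current bit position in x and mask
    while mask >> i:
        if (mask >> i) & 1:
            result |= ((x >> i) & 1) << k
            k += 1
        i += 1
    return result


def pdep(x: int, mask: int) -> int:
    """Parallel deposit (bit scatter): place the low-order bits of x at
    the positions selected by mask, preserving their order."""
    result = 0
    k = 0  # next low-order bit of x to consume
    i = 0  # current bit position in mask and result
    while mask >> i:
        if (mask >> i) & 1:
            result |= ((x >> k) & 1) << i
            k += 1
        i += 1
    return result
```

The two operations are partial inverses: `pdep(pext(x, m), m)` equals `x & m` for any word `x` and mask `m`.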
IJARCCE, 2017
A permutation is an arrangement that can be made from a given number of things, taking some or all of them at a time. The notation P(n,r) denotes the number of permutations of n things taken r at a time. Permutations are used in pattern analysis, databases and data mining, simulation, accumulation of electronic communication data, homeland security, and wireless networks. Bottom-Up, Lexicographic, Johnson-Trotter, Backtracking, Heap's algorithm, and Brute-Force are among the most popular permutation algorithms that have emerged during the past decades. Lexicographic ordering is used in graph colouring, and backtracking is used in Sudoku. A performance analysis in terms of time complexity is done by implementing the algorithms in different programming languages on different platforms. In cryptography, ciphertext is the result of encryption performed on plaintext using an algorithm called a cipher, which is used in client-server applications.
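Two of the items in this abstract are easy to make concrete. The sketch below computes P(n,r) from the standard formula n!/(n-r)! and generates permutations with the usual non-recursive formulation of Heap's algorithm. The function names are hypothetical, and this is not the implementation benchmarked in the paper.

```python
import math
from typing import Iterable, Iterator


def p(n: int, r: int) -> int:
    """Number of permutations of n things taken r at a time: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)


def heap_permutations(items: Iterable) -> Iterator[tuple]:
    """Yield every permutation of items using Heap's algorithm (iterative).

    Each new permutation differs from the previous one by a single swap.
    """
    a = list(items)
    n = len(a)
    c = [0] * n  # c[i] counts the swaps done so far at level i
    yield tuple(a)
    i = 0
    while i < n:
        if c[i] < i:
            if i % 2 == 0:
                a[0], a[i] = a[i], a[0]        # even level: swap with the first element
            else:
                a[c[i]], a[i] = a[i], a[c[i]]  # odd level: swap with element c[i]
            yield tuple(a)
            c[i] += 1
            i = 0
        else:
            c[i] = 0
            i += 1
```

For a list of n distinct items, `heap_permutations` yields exactly P(n,n) = n! tuples, which gives a quick consistency check between the two functions.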