
Comparing Fast Implementations of Bit Permutation Instructions

Abstract

Several candidate instructions have recently been proposed for computing arbitrary bit permutations efficiently. Among these, GRP is the most attractive: beyond permutation, it is useful for other applications such as sorting, and it has good inherent cryptographic properties. However, the current implementation of GRP is the slowest of the candidates, whereas BFLY is the fastest. In this paper, we examine the possibility of executing GRP on a butterfly or an inverse butterfly network.
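To make the operation concrete, the following is a minimal software sketch of GRP. It assumes the semantics usually given in the bit-permutation literature, which this abstract does not spell out: data bits whose control bit is 0 are gathered at the left end of the result, data bits whose control bit is 1 are gathered at the right end, and relative order within each group is preserved. The function name `grp` and the word-width parameter `n` are illustrative choices, not part of any instruction set definition.

```python
def grp(x: int, c: int, n: int = 8) -> int:
    """Software model of GRP on n-bit words (assumed semantics, see lead-in)."""
    zeros = []  # data bits whose control bit is 0, in left-to-right order
    ones = []   # data bits whose control bit is 1, in left-to-right order
    for i in range(n - 1, -1, -1):  # scan from the most significant bit down
        bit = (x >> i) & 1
        if (c >> i) & 1:
            ones.append(bit)
        else:
            zeros.append(bit)
    # Concatenate the groups: control-0 bits on the left, control-1 bits on the right.
    result = 0
    for bit in zeros + ones:
        result = (result << 1) | bit
    return result
```

For example, under these assumptions `grp(0b10110010, 0b01010101)` moves the bits at even positions (counting from the left) to the upper half of the word and the rest to the lower half. This loop touches one bit at a time; it models only the input-output behavior and says nothing about the hardware latency the paper is concerned with.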