Coordinate Rotation Digital Computer Algorithm Des

The document discusses the COordinate Rotation DIgital Computer (CORDIC) algorithm, highlighting its efficiency and low-cost implementation for various applications such as trigonometric functions and scientific computations. It reviews advancements in CORDIC architectures aimed at improving throughput and reducing latency, including parallel and pipelined designs. The paper also categorizes different CORDIC architectures, such as bit parallel and bit serial implementations, and emphasizes their relevance in fields like signal processing and robotics.

(IJACSA) International Journal of Advanced Computer Science and Applications,

Vol. 2, No. 4, 2011

Coordinate Rotation Digital Computer Algorithm: Design and Architectures

Naveen Kumar
Electronics & Communication Engineering
University College of Engineering
Punjabi University, Patiala
Punjab, India

Amandeep Singh Sappal
Electronics & Communication Engineering
University College of Engineering
Punjabi University, Patiala
Punjab, India

Abstract- The COordinate Rotation DIgital Computer (CORDIC) algorithm has potential for efficient and low-cost implementation of a large class of applications, including the generation of trigonometric, logarithmic and transcendental elementary functions, complex number multiplication, matrix inversion, solution of linear systems and general scientific computation. This paper presents a brief overview of the developments in the CORDIC algorithm and its architectures.

Keywords- CORDIC Algorithms; CORDIC Architectures; FPGA.

I. INTRODUCTION
First described in 1959 [1], the CORDIC algorithm is an iterative algorithm that can be used for the computation of trigonometric functions, multiplication and division. The last half century has witnessed a lot of progress in the design and development of architectures of the algorithm for high-performance and low-cost hardware solutions. The CORDIC algorithm gained popularity when [2] showed that, by varying a few simple parameters, it could be used as a single algorithm for unified implementation of a wide range of elementary transcendental functions involving logarithms, exponentials and square roots. Around the same time, [3] showed that the CORDIC technique is a better choice for scientific calculator applications.

The popularity of CORDIC was very much enhanced thereafter, primarily due to its potential for efficient and low-cost implementation. With the advent of low-cost, low-power FPGAs, the algorithm has shown the same potential in FPGA-based designs. The CORDIC algorithm can be widely used in applications such as wireless communications, Software Defined Radio and medical imaging, which are heavily dependent on signal processing. Some other upcoming applications are:
• Direct frequency synthesis, digital modulation and coding for speech/music synthesis and communication;
• Direct and inverse kinematics computation for robot manipulation;
• Planar and three-dimensional vector rotation for graphics and animation.
Although CORDIC may not be the fastest technique to perform these operations, it is attractive due to its simplicity and efficient hardware implementation.

The development of CORDIC algorithms and architectures has been aimed at achieving a high throughput rate and at reducing hardware complexity as well as the latency of implementation. Latency is an inherent drawback of the conventional CORDIC algorithm. Angle recoding schemes and higher-radix CORDIC have been developed for reduced-latency realization. Parallel and pipelined CORDIC have been suggested for high-throughput computation.

This paper presents an overview of the development of the CORDIC algorithm. The paper is organized as follows: Section II discusses the basics of the CORDIC algorithm, and different CORDIC architectures are discussed in Section III. The conclusion, along with future research directions, is given in Section IV.

II. DEFINITION OF CORDIC
CORDIC is a very simple, iterative convergence algorithm that replaces complex multiplication with shift-and-add operations, greatly simplifying overall hardware complexity. This makes it an attractive option for system designers as they continue to face the challenge of balancing aggressive cost and power targets against the increased performance required in next-generation signal processing solutions. This section describes the basic principle underlying CORDIC-based computation and presents its iterative algorithm for different operating modes and planar coordinate systems.

A. Overview of CORDIC Algorithm
The CORDIC algorithm has two computing modes: vector rotation and vector translation. It was initially designed to perform a vector rotation, where the vector V with components (X, Y) is rotated through the angle θ, yielding a new vector V' with components (X', Y'), as shown in Fig. 1.

$$V' = [R][V] \tag{1}$$

where R is the rotation matrix:


$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{2}$$

$$R = \begin{bmatrix} \dfrac{1}{\sqrt{1+\tan^2\theta}} & -\dfrac{\tan\theta}{\sqrt{1+\tan^2\theta}} \\ \dfrac{\tan\theta}{\sqrt{1+\tan^2\theta}} & \dfrac{1}{\sqrt{1+\tan^2\theta}} \end{bmatrix} \tag{3}$$

Figure 1: Vector Rotation

By factoring out the cosine term in (3), the rotation matrix R can be rewritten as

$$R = \left(1+\tan^2\theta\right)^{-1/2} \begin{bmatrix} 1 & -\tan\theta \\ \tan\theta & 1 \end{bmatrix} \tag{4}$$

and can be interpreted as a product of a scale-factor $K = \left(1+\tan^2\theta\right)^{-1/2}$ with a pseudo-rotation matrix $R_c$, given by

$$R_c = \begin{bmatrix} 1 & -\tan\theta \\ \tan\theta & 1 \end{bmatrix} \tag{5}$$

In vector translation, the algorithm rotates the vector V with components (X, Y) around the circle until the Y component equals zero, as illustrated in Fig. 2. The outputs from vector translation are the magnitude X' and the phase θ' of the input vector V.

After vector translation, the output equations are:

$$X' = K_i \sqrt{X^2 + Y^2} \tag{6}$$

$$Y' = 0 \tag{7}$$

$$\theta' = \operatorname{atan}\left(\frac{Y}{X}\right) \tag{8}$$

Figure 2: Vector Translation

To achieve simplicity of hardware realization of the rotation, the key ideas used in CORDIC arithmetic are (i) to decompose the rotation into a sequence of elementary rotations through predefined angles that can be implemented with minimum hardware cost, and (ii) to avoid scaling, which might involve arithmetic operations such as square root and division. The second idea is based on the fact that the scale-factor contains only magnitude information and no information about the angle of rotation.

B. Generalized CORDIC Algorithm
A few years later, Walther found how CORDIC iterations could be modified to compute hyperbolic functions [2] and reformulated the CORDIC algorithm into a generalized and unified form suitable for performing rotations in circular, hyperbolic and linear coordinate systems. The unified formulation includes a new variable m, which is assigned different values for different coordinate systems. The generalized CORDIC is formulated as follows:

$$\begin{aligned} x_{i+1} &= x_i - m\,\sigma_i\,2^{-i}\,y_i \\ y_{i+1} &= y_i + \sigma_i\,2^{-i}\,x_i \\ w_{i+1} &= w_i - \sigma_i\,\alpha_i \end{aligned} \tag{9}$$

Here

$$\sigma_i = \begin{cases} \operatorname{sign}(w_i) & \text{for rotation mode} \\ -\operatorname{sign}(y_i) & \text{for vectoring mode} \end{cases}$$

III. CORDIC ARCHITECTURES
CORDIC computation is inherently sequential due to two main bottlenecks. First, the micro-rotation for any iteration is performed on the intermediate vector computed by the previous iteration. Second, the (i+1)th iteration can be started only after the completion of the ith iteration, since the value of $\sigma_{i+1}$, which is required to start the (i+1)th iteration, is known only after the ith iteration completes. To alleviate the second bottleneck, some attempts have been made to evaluate the $\sigma_i$ values corresponding to small micro-rotation angles [4]. However, the CORDIC iterations still cannot be performed in parallel because of the first bottleneck. A partial parallelization has been realized in [4] by combining a pair of conventional CORDIC iterations into a single merged iteration, which provides better area-delay efficiency. However, the accuracy is slightly affected by such merging, and the approach cannot be extended to a larger number of conventional CORDIC iterations since the induced error becomes unacceptable [5]. Parallel realization of CORDIC iterations to handle the first bottleneck by direct unfolding of micro-rotations is possible, but it increases computational complexity and erodes the simplicity that is the main advantage of the CORDIC algorithm [6]. Although no popular architectures are known to us for fully parallel implementation of CORDIC, different forms of pipelined implementation have been proposed for improving the computational throughput [7]. To handle latency bottlenecks, various architectures have been developed and are reported in this review. Most of the well-known architectures can be grouped under bit-parallel iterative CORDIC, bit-parallel unrolled CORDIC, bit-serial iterative CORDIC and


pipelined CORDIC architecture, which we discuss briefly in the following subsections.

A. Bit Parallel Iterative CORDIC Architecture
The vector rotation CORDIC structure is represented by the schematic in Fig. 3. Each branch consists of an adder-subtractor combination, a shift unit and a register for buffering the output. At the beginning of a calculation, initial values are fed into the registers by the multiplexers, and the MSB of the value stored in the z branch determines the operation mode of the adder-subtractors. Signals in the x and y branches pass through the shift units and are then added to or subtracted from the unshifted signal in the opposite path. The z branch arithmetically combines the register value with the values taken from a lookup table (LUT) whose address changes according to the iteration number. For n iterations, the output is mapped back to the registers; after that, initial values are fed in again and the final sine value can be accessed at the output. A simple finite-state machine is needed to control the multiplexers, the shift distance and the addressing of the constant values.

When implemented in an FPGA, the initial values for the vector coordinates as well as the constant values in the LUT can be hardwired in a word-wide manner. The adder and the subtractor components are implemented separately, and a multiplexer controlled by the sign of the angle accumulator distinguishes between addition and subtraction by routing the signals as required. The shift operations as implemented change the shift distance with the number of iterations, but they require a high fan-in and reduce the maximum speed for the application. In addition, the output rate is limited by the fact that operations are performed iteratively, and therefore the maximum output rate equals 1/n times the clock rate.

Figure 3: Iterative CORDIC

B. Bit Parallel Unrolled CORDIC Architecture
Instead of buffering the output of one iteration and using the same resources again, one could simply cascade the iterative CORDIC, which means rebuilding the basic CORDIC structure for each iteration. Consequently, the output of one stage is the input of the next one, as shown in Fig. 4, and with separate stages two simplifications become possible. First, the shift operations for each step can be performed by wiring the connections between stages appropriately. Second, there is no need for changing constant values, and those can therefore be hardwired as well. The purely unrolled design consists only of combinatorial components and computes one sine value per clock cycle. Input values find their path through the architecture on their own and do not need to be controlled. The area in FPGAs can be measured in CLBs, each of which consists of two lookup tables as well as storage cells with additional control components. For the purely combinatorial design, the CLBs' function generators perform the add and shift operations and no storage cells are used. This means registers can be inserted easily without significantly increasing the area. Pipelining adds some latency, of course, but the application needs to output values at 48 kHz and the latency for 14 iterations equals 312.5 µs, which is known to be imperceptible. In return, inserting registers between stages also reduces the maximum path delays, and correspondingly a higher maximum speed can be achieved.

Figure 4: Unrolled CORDIC
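The datapath described in subsections A and B can be mimicked in software. The following is a minimal sketch, not the authors' design: it assumes a 16-bit fractional fixed-point format and 16 iterations, and the names `FRAC_BITS`, `N_ITER`, `ATAN_LUT`, `X_INIT` and `cordic_sin_cos` are hypothetical. The sign test on `z` plays the role of the MSB of the z branch, `>> i` stands for the shift unit, and `ATAN_LUT[i]` for the constant read from the LUT. In an unrolled design, each pass of the loop would become one hardware stage with its shift distance `i` fixed by wiring.

```python
import math

FRAC_BITS = 16            # fractional bits of the fixed-point format (assumed)
N_ITER = 16               # number of micro-rotations (assumed)
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

# Constants of the z branch: arctan(2^-i) in fixed point, one per iteration.
ATAN_LUT = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]

# The shift-add rotations stretch the vector by a constant gain, so the
# x register starts at 1/gain and no final multiplication is needed.
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))
X_INIT = round(ONE / GAIN)


def cordic_sin_cos(angle_rad):
    """Return (sin, cos) of angle_rad, for |angle_rad| <= pi/2.

    Only integer additions, subtractions and arithmetic right shifts are
    used inside the loop, as in the iterative hardware structure.
    """
    x, y, z = X_INIT, 0, round(angle_rad * ONE)
    for i in range(N_ITER):
        if z >= 0:
            # Sign bit of z clear: rotate counter-clockwise.
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_LUT[i]
        else:
            # Sign bit of z set: rotate clockwise.
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_LUT[i]
    return y / ONE, x / ONE


if __name__ == "__main__":
    s, c = cordic_sin_cos(math.pi / 6)
    print("sin:", s, "cos:", c)
```

With these assumed word lengths the results agree with `math.sin` and `math.cos` to roughly three decimal places; more fractional bits and more iterations would be needed for higher accuracy.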
C. Bit Serial Iterative CORDIC Architecture
Both the unrolled and the iterative bit-parallel designs
show disadvantages in terms of complexity and path delays


going along with the large number of cross connections between single stages. To reduce this complexity, one could change the design into a completely bit-serial iterative architecture. Bit-serial means that only one bit is processed at a time, and hence the cross connections become one-bit-wide data paths. Clearly, the throughput becomes a function of

$$\frac{\text{clock rate}}{\text{number of iterations} \times \text{word length}}$$

In spite of this, the output rate can be almost as high as that achieved with the unrolled design. The reason is the structural simplicity of a bit-serial design and the correspondingly high clock rate achievable. Fig. 5 shows the basic architecture of the bit-serial CORDIC processor.

Figure 5: Bit-serial CORDIC

D. Pipelined CORDIC Architecture
Since the CORDIC iterations are identical, it is very convenient to map them onto pipelined architectures. The main emphasis in efficient pipelined implementation lies in the minimization of the critical path. The earliest pipelined architecture that we find was suggested in 1984. Pipelined CORDIC circuits have been used thereafter for high-throughput implementation of sinusoidal wave generation, fixed and adaptive filters, discrete orthogonal transforms and other signal processing applications [8].

IV. CONCLUSION
The CORDIC algorithm can be implemented using simple hardware through repeated shift-add operations. This feature makes it attractive for a wide variety of applications. Moreover, its applications in several diverse areas, including signal processing, image processing, communication, robotics and graphics, apart from general scientific and technical computations, have been explored. In the last half century, several algorithms and architectures have been developed to speed up the CORDIC algorithm by reducing its iteration count and through its pipelined implementation.

ACKNOWLEDGMENT
The authors would like to thank the reviewers for their help in improving the document.

REFERENCES
[1] J. E. Volder, “The CORDIC trigonometric computing technique,” IRE Transactions on Electronic Computers, vol. EC-8, pp. 330–334, Sept. 1959.
[2] J. S. Walther, “A unified algorithm for elementary functions,” in Proceedings of the 38th Spring Joint Computer Conference, Atlantic City, NJ, 1971, pp. 379–385.
[3] D. S. Cochran, “Algorithms and accuracy in the HP-35,” Hewlett-Packard Journal, pp. 1–11, June 1972.
[4] S. Wang, V. Piuri, and E. E. Swartzlander, Jr., “Hybrid CORDIC algorithms,” IEEE Transactions on Computers, vol. 46, no. 11, pp. 1202–1207, Nov. 1997.
[5] S. Wang and E. E. Swartzlander, “Merged CORDIC algorithm,” in IEEE International Symposium on Circuits and Systems (ISCAS’95), 1995, vol. 3, pp. 1988–1991.
[6] B. Gisuthan and T. Srikanthan, “Pipelining flat CORDIC based trigonometric function generators,” Microelectronics Journal, vol. 33, pp. 77–89, 2002.
[7] E. Deprettere, P. Dewilde, and R. Udo, “Pipelined CORDIC architectures for fast VLSI filtering and array processing,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’84), March 1984, vol. 9, pp. 250–253.
[8] D. E. Metafas and C. E. Goutis, “A floating point pipeline CORDIC processor with extended operation set,” in IEEE International Symposium on Circuits and Systems (ISCAS’91), June 1991, vol. 5, pp. 3066–3069.

AUTHORS PROFILE
Naveen Kumar received the Bachelor of Technology (B.Tech) degree in 2009. Currently he is pursuing a Master of Technology (M.Tech) in Electronics & Communication at Punjabi University, Patiala, India.

Amandeep Singh Sappal has submitted his Ph.D. in Electronics & Communication at Punjabi University, Patiala, and is presently working as an Assistant Professor at Punjabi University, Patiala, India. He has published more than 25 papers in reputed journals and conferences. He is a reviewer for journals of publishers such as Elsevier and Springer. Presently he is guiding 5 [Link] students.
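As a supplementary illustration of the unified recurrence (9) in Section II-B, the following floating-point sketch iterates x, y and w for the circular (m = 1) and linear (m = 0) systems in both rotation and vectoring mode. It is a sketch under stated assumptions rather than the formulation of [2]. The function names `cordic` and `circular_gain` are hypothetical. The elementary angles, which the text does not spell out, are assumed to take the usual values arctan(2^-i) for m = 1 and 2^-i for m = 0. The vectoring direction is assumed to be -sign(y_i), with x > 0. The hyperbolic system (m = -1) is deliberately left out because it needs a modified iteration schedule.

```python
import math


def cordic(x, y, w, m=1, mode="rotation", n=32):
    """Run n iterations of the unified recurrence in floating point.

    m = 1 selects the circular system and m = 0 the linear one.
    mode is "rotation" (drive w to zero) or "vectoring" (drive y to zero).
    """
    if m not in (0, 1):
        raise ValueError("this sketch covers only m = 1 and m = 0")
    if mode not in ("rotation", "vectoring"):
        raise ValueError("mode must be 'rotation' or 'vectoring'")
    for i in range(n):
        shift = 2.0 ** -i
        # Assumed elementary angle: arctan(2^-i) if circular, 2^-i if linear.
        alpha = math.atan(shift) if m == 1 else shift
        if mode == "rotation":
            sigma = 1.0 if w >= 0 else -1.0   # sign(w_i), taking sign(0) = +1
        else:
            sigma = -1.0 if y >= 0 else 1.0   # -sign(y_i), assumes x > 0
        x, y, w = (x - m * sigma * shift * y,
                   y + sigma * shift * x,
                   w - sigma * alpha)
    return x, y, w


def circular_gain(n=32):
    """Constant by which n circular micro-rotations stretch the vector."""
    return math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(n))


if __name__ == "__main__":
    gain = circular_gain()
    # Rotation mode, circular: pre-scaled start vector gives cosine and sine.
    cos_v, sin_v, _ = cordic(1.0 / gain, 0.0, math.pi / 6)
    # Vectoring mode, circular: scaled magnitude in x, phase in w.
    mag, _, phase = cordic(3.0, 4.0, 0.0, mode="vectoring")
    # Rotation mode, linear: y accumulates the product x * w.
    _, product, _ = cordic(3.0, 0.0, 0.5, m=0)
    print(cos_v, sin_v, mag / gain, phase, product)
```

In this sketch the circular modes converge for angles up to about 1.74 rad in magnitude, and the linear rotation mode for |w| up to 2. Inputs outside these ranges would need a prior range reduction, which is not shown here.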
