FFT Basics for ECE Students

The document discusses the fast Fourier transform (FFT) algorithm. It begins by outlining the basic workings of the radix-2 decimation-in-time FFT algorithm. It then shows how the FFT provides significant computational savings over direct computation of the discrete Fourier transform (DFT) by reducing the operation count from O(N²) to O(N log N). Specifically, it decomposes the DFT into smaller DFTs using the Cooley-Tukey algorithm, represented via a signal flowgraph. This decomposition process is continued until only basic 2-point DFTs or "butterflies" remain.


18-791 Lecture #17

INTRODUCTION TO THE
FAST FOURIER TRANSFORM ALGORITHM

Richard M. Stern

Department of Electrical and Computer Engineering


Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
Phone: +1 (412) 268-2535
FAX: +1 (412) 268-3890
[email protected]
http://www.ece.cmu.edu/~rms
October 24, 2005
Introduction

• Today we will begin our discussion of the family of algorithms known as "Fast Fourier Transforms" (FFTs), which have revolutionized digital signal processing

• What is the FFT?
  – A collection of "tricks" that exploit the symmetry of the DFT calculation to make its execution much faster
  – Speedup increases with DFT size

• Today - will outline the basic workings of the simplest formulation, the radix-2 decimation-in-time algorithm

• Thursday - will discuss some of the variations and extensions
  – Alternate structures
  – Non-radix-2 formulations
Introduction, continued

• Some dates:
  – ~1805 - algorithm first described by Gauss
  – 1965 - algorithm rediscovered (not for the first time) by Cooley and Tukey

• In 1967 (spring of my freshman year), calculation of an 8192-point DFT on the top-of-the-line IBM 7094 took ….
  – ~30 minutes using conventional techniques
  – ~5 seconds using FFTs

Measures of computational efficiency

• Could consider
  – Number of additions
  – Number of multiplications
  – Amount of memory required
  – Scalability and regularity

• For the present discussion we'll focus mostly on the number of multiplications as a measure of computational complexity
  – More costly than additions for fixed-point processors
  – Same cost as additions for floating-point processors, but the numbers of the two operations are comparable

Computational Cost of Discrete-Time Filtering

• Convolution of an N-point input with an M-point unit sample response ….

• Direct convolution:

  $$y[n] = \sum_{k} x[k]\, h[n-k]$$

  – Number of multiplies ≈ MN

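To make the MN figure concrete, here is a minimal Python sketch (not part of the original slides) that carries out the direct convolution and tallies the real multiplications; the function and variable names are illustrative only.

```python
def direct_convolution(x, h):
    """Direct convolution y[n] = sum_k x[k] h[n-k]; returns (y, multiply_count)."""
    N, M = len(x), len(h)
    y = [0.0] * (N + M - 1)
    count = 0
    for n in range(N + M - 1):
        for k in range(N):
            if 0 <= n - k < M:          # only overlapping samples contribute
                y[n] += x[k] * h[n - k]
                count += 1
    return y, count

# Example: N = 1000, M = 64 gives exactly N * M = 64000 multiplies.
y, count = direct_convolution([1.0] * 1000, [1.0] * 64)
print(count)   # 64000
```
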
Computational Cost of Discrete-Time Filtering

• Convolution of an N-point input with an M-point unit sample response ….

• Using transforms directly:

  $$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$$

  – Computation of each N-point DFT requires N² multiplys
  – Each convolution requires three DFTs of length N + M − 1 plus an additional N + M − 1 complex multiplys, or

  $$3(N+M-1)^2 + (N+M-1)$$

  – For N >> M, for example, the computation is O(N²)

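As a hedged illustration of the transform-based procedure just described (three DFTs of length N + M − 1 plus the pointwise products), the sketch below leans on NumPy's FFT routines purely for convenience; the slide's point is the operation count, not any particular implementation.

```python
import numpy as np

def filter_via_dfts(x, h):
    """Linear convolution computed with three DFTs of length N + M - 1
    plus N + M - 1 pointwise complex multiplies."""
    N, M = len(x), len(h)
    L = N + M - 1                  # transform length needed to avoid circular wrap-around
    X = np.fft.fft(x, L)           # DFT #1 (zero-padded)
    H = np.fft.fft(h, L)           # DFT #2 (zero-padded)
    Y = X * H                      # N + M - 1 pointwise multiplies
    return np.fft.ifft(Y).real     # DFT #3 (inverse)

x = np.random.randn(1000)
h = np.random.randn(64)
assert np.allclose(filter_via_dfts(x, h), np.convolve(x, h))
```
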
Computational Cost of Discrete-Time Filtering

• Convolution of an N-point input with an M-point unit sample response ….

• Using overlap-add with sections of length L:
  – N/L sections, 2 DFTs per section of size L + M − 1, plus additional multiplys for the DFT coefficients, plus one more DFT for h[n]:

  $$\frac{2N}{L}(L+M-1)^2 \;+\; \frac{N}{L}(L+M-1) \;+\; (L+M-1)^2$$

  – For very large N, the computation is still proportional to M²

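For completeness, a minimal overlap-add sketch under the same bookkeeping as the slide (sections of length L, transforms of length L + M − 1); again NumPy's FFT merely stands in for the DFTs, and the names are illustrative.

```python
import numpy as np

def overlap_add(x, h, L):
    """Overlap-add convolution: ~N/L sections, each filtered with a
    length-(L + M - 1) transform, outputs overlapped and summed."""
    N, M = len(x), len(h)
    P = L + M - 1                        # transform length per section
    H = np.fft.fft(h, P)                 # one DFT of h[n], reused for every section
    nsec = -(-N // L)                    # ceil(N / L) sections
    y = np.zeros(nsec * L + M - 1)
    for s in range(nsec):
        seg = x[s * L:(s + 1) * L]
        block = np.fft.ifft(np.fft.fft(seg, P) * H).real   # 2 DFTs + P multiplies
        y[s * L:s * L + P] += block      # overlap and add
    return y[:N + M - 1]

x = np.random.randn(1000)
h = np.random.randn(64)
assert np.allclose(overlap_add(x, h, L=128), np.convolve(x, h))
```
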
The Cooley-Tukey decimation-in-time algorithm

• Consider the DFT algorithm for N an integer power of 2, $N = 2^\nu$:

  $$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk} = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi nk/N}, \qquad W_N \equiv e^{-j 2\pi/N}$$

• Create separate sums for even and odd values of n:

  $$X[k] = \sum_{n\ \mathrm{even}} x[n]\, W_N^{nk} \;+\; \sum_{n\ \mathrm{odd}} x[n]\, W_N^{nk}$$

• Letting $n = 2r$ for n even and $n = 2r+1$ for n odd, we obtain

  $$X[k] = \sum_{r=0}^{(N/2)-1} x[2r]\, W_N^{2rk} \;+\; \sum_{r=0}^{(N/2)-1} x[2r+1]\, W_N^{(2r+1)k}$$

The Cooley-Tukey decimation-in-time algorithm

• Splitting indices in time, we have obtained

  $$X[k] = \sum_{r=0}^{(N/2)-1} x[2r]\, W_N^{2rk} \;+\; \sum_{r=0}^{(N/2)-1} x[2r+1]\, W_N^{(2r+1)k}$$

• But $W_N^2 = e^{-j 2\pi \cdot 2/N} = e^{-j 2\pi/(N/2)} = W_{N/2}$ and $W_N^{2rk} W_N^k = W_N^k\, W_{N/2}^{rk}$

• So …

  $$X[k] = \sum_{r=0}^{(N/2)-1} x[2r]\, W_{N/2}^{rk} \;+\; W_N^k \sum_{r=0}^{(N/2)-1} x[2r+1]\, W_{N/2}^{rk}$$

  where the first sum is the N/2-point DFT of x[2r] and the second sum is the N/2-point DFT of x[2r+1]

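The even/odd split applies equally well to each half-length DFT, which suggests a recursive implementation. Below is a textbook-style sketch (my illustration, not code from the lecture) of a recursive radix-2 decimation-in-time DFT that follows the formula above; it assumes the input length is a power of two.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time DFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    G = fft_dit(x[0::2])        # (N/2)-point DFT of the even-indexed samples x[2r]
    H = fft_dit(x[1::2])        # (N/2)-point DFT of the odd-indexed samples x[2r+1]
    half = N // 2
    # X[k] = G[k] + W_N^k H[k]; G and H are periodic with period N/2
    return [G[k % half] + cmath.exp(-2j * cmath.pi * k / N) * H[k % half]
            for k in range(N)]

# Quick check against a direct O(N^2) DFT for N = 8
x = [1, 2, 3, 4, 0, 0, 0, 0]
direct = [sum(xn * cmath.exp(-2j * cmath.pi * n * k / 8) for n, xn in enumerate(x))
          for k in range(8)]
assert all(abs(a - b) < 1e-9 for a, b in zip(fft_dit(x), direct))
```
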
Savings so far …

• We have split the DFT computation into two halves:

  $$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk} = \sum_{r=0}^{(N/2)-1} x[2r]\, W_{N/2}^{rk} \;+\; W_N^k \sum_{r=0}^{(N/2)-1} x[2r+1]\, W_{N/2}^{rk}$$

• Have we gained anything? Consider the nominal number of multiplications for N = 8:
  – The original form produces 8² = 64 multiplications
  – The new form produces 2(4²) + 8 = 40 multiplications
  – So we're already ahead ….. Let's keep going!!

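Continuing the split recursively, the nominal multiply count obeys a simple recursion; the short sketch below (illustrative only, with the 2-point DFTs still charged at the nominal N² rate) tallies it.

```python
def dft_mults(N):
    """Nominal multiplies for a direct N-point DFT."""
    return N * N

def split_mults(N):
    """Nominal multiplies after recursively splitting the DFT into two halves,
    at a cost of N twiddle-factor multiplies per split, stopping at 2-point DFTs."""
    if N <= 2:
        return N * N
    return 2 * split_mults(N // 2) + N

for N in (8, 64, 1024):
    print(N, dft_mults(N), split_mults(N))
# For N = 8 this prints 64 (direct) and 32 (fully split); the single split above gave 40.
# As N grows, the fully split count tracks the ~N log2(N) figure derived later in the lecture.
```
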
Signal flowgraph notation

• In generalizing this formulation, it is most convenient to adopt a graphic approach …

• Signal flowgraph notation describes the three basic DSP operations:
  – Addition: inputs x[n] and y[n] merge to produce x[n] + y[n]
  – Multiplication by a constant: x[n] through a branch with gain a produces a·x[n]
  – Delay: x[n] through a z⁻¹ branch produces x[n−1]

Signal flowgraph representation of 8-point DFT

• Recall that the DFT is now of the form $X[k] = G[k] + W_N^k H[k]$

• The DFT in (partial) flowgraph notation:

Continuing with the decomposition …

• So why not break up into additional DFTs? Let's take the upper 4-point DFT and break it up into two 2-point DFTs:

The complete decomposition into 2-point DFTs

Now let’s take a closer look at the 2-point DFT

• The expression for the 2-point DFT is:

  $$X[k] = \sum_{n=0}^{1} x[n]\, W_2^{nk} = \sum_{n=0}^{1} x[n]\, e^{-j 2\pi nk/2}$$

• Evaluating for k = 0, 1 we obtain

  $$X[0] = x[0] + x[1]$$
  $$X[1] = x[0] + e^{-j 2\pi \cdot 1/2}\, x[1] = x[0] - x[1]$$

  which in signal flowgraph notation looks like ...

• This topology is referred to as the basic butterfly

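In code, the 2-point DFT collapses to one addition and one subtraction, since W₂⁰ = 1 and W₂¹ = −1; a minimal sketch:

```python
def butterfly2(x0, x1):
    """2-point DFT ("butterfly"): X[0] = x0 + x1, X[1] = x0 - x1."""
    return x0 + x1, x0 - x1

print(butterfly2(1.0, 3.0))   # (4.0, -2.0)
```
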
The complete 8-point decimation-in-time FFT

Number of multiplys for N-point FFTs

• Let $N = 2^\nu$, where $\nu = \log_2(N)$

• (log2(N) columns) × (N/2 butterflies/column) × (2 mults/butterfly), or ~ N log2(N) multiplys
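As a concrete check (the N = 1024 case is not worked on this slide):

$$\underbrace{\log_2(1024)}_{10\ \text{columns}} \times \underbrace{\tfrac{1024}{2}}_{512\ \text{butterflies}} \times 2 = 10{,}240\ \text{multiplys} \quad\text{versus}\quad 1024^2 = 1{,}048{,}576\ \text{for the direct DFT,}$$

a reduction of roughly a factor of 100, consistent with the comparison that follows.
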
Comparing processing with and without FFTs

• "Slow" DFT requires N² mults; FFT requires N log2(N) mults

• Filtering using FFTs requires 3(N log2(N)) + 2N mults

• Let $\alpha_1 = N \log_2(N) / N^2$ and $\alpha_2 = [3(N \log_2(N)) + N] / N^2$

        N       α1        α2
       16      .25       .8124
       32      .156      .50
       64      .0935     .297
      128      .055      .171
      256      .031      .097
     1024      .0097     .0302

  Note: 1024-point FFTs accomplish speedups of roughly 100 for DFTs and 30 for filtering!
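The ratios in the table can be reproduced in a few lines; the sketch below is illustrative only and uses the α₂ definition given above.

```python
import math

for N in (16, 32, 64, 128, 256, 1024):
    a1 = (N * math.log2(N)) / N**2             # alpha_1: FFT vs. direct DFT
    a2 = (3 * N * math.log2(N) + N) / N**2     # alpha_2, per the definition above
    print(f"{N:5d}   {a1:.4f}   {a2:.4f}")
# The N = 1024 row gives about 0.0098 and 0.030, i.e. speedups of ~100 (DFT) and ~30 (filtering).
```
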
Additional timesavers: reducing multiplications
in the basic butterfly

• As we derived it, the basic butterfly is of the form

  [Butterfly with twiddle factors $W_N^r$ and $W_N^{r+N/2}$ on its two branches]

• Since $W_N^{N/2} = -1$, we can reduce the computation by a factor of 2 by premultiplying by $W_N^r$:

  [Simplified butterfly: a single multiplication by $W_N^r$, followed by +1 and −1 branches]
Bit reversal of the input

• Recall the first stages of the 8-point FFT

• Consider the binary representation of the indices of the input:

    0   000
    4   100
    2   010
    6   110
    1   001
    5   101
    3   011
    7   111

• If these binary indices are bit reversed, we get the binary sequence representing 0, 1, 2, 3, 4, 5, 6, 7

• Hence the indices of the FFT inputs are said to be in bit-reversed order

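A short illustrative sketch of the bit-reversal permutation for N = 8 (it simply reverses the log₂(N)-bit binary index):

```python
def bit_reverse(n, bits):
    """Reverse the lowest `bits` bits of the integer n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)
        n >>= 1
    return r

N, bits = 8, 3
order = [bit_reverse(k, bits) for k in range(N)]
print(order)   # [0, 4, 2, 6, 1, 5, 3, 7] -- the input ordering shown above
```
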
Some comments on bit reversal

• In the implementation of the FFT that we discussed, the input is bit reversed and the output is developed in natural order (a code sketch follows this list)

• Some other implementations of the FFT have the input in natural order and the output bit reversed (to be described Thursday)

• In some situations it is convenient to implement filtering applications by
  – Use FFTs with input in natural order, output in bit-reversed order
  – Multiply frequency coefficients together (in bit-reversed order)
  – Use inverse FFTs with input in bit-reversed order, output in natural order

• Computing in this fashion means we never have to compute bit reversal explicitly
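Putting the pieces together, here is a hedged sketch (my illustration, not the lecture's code) of an in-place iterative radix-2 decimation-in-time FFT of the kind described in the first bullet: the input is permuted into bit-reversed order and the output emerges in natural order. It assumes the length is a power of two.

```python
import cmath

def bit_reverse(n, bits):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)
        n >>= 1
    return r

def fft_iterative(x):
    """In-place iterative radix-2 DIT FFT: bit-reversed input ordering,
    naturally ordered output. len(x) must be a power of 2."""
    N = len(x)
    bits = N.bit_length() - 1
    X = [complex(x[bit_reverse(n, bits)]) for n in range(N)]   # bit-reverse the input
    m = 2                                         # current butterfly span
    while m <= N:
        Wm = cmath.exp(-2j * cmath.pi / m)        # W_m = e^{-j 2 pi / m}
        for start in range(0, N, m):              # each group of m points
            W = 1 + 0j
            for r in range(m // 2):               # N/2 butterflies per column overall
                a = X[start + r]
                t = W * X[start + r + m // 2]     # one complex multiply per butterfly
                X[start + r] = a + t
                X[start + r + m // 2] = a - t
                W *= Wm
        m *= 2
    return X

# Check against a direct DFT for N = 8
x = [0, 1, 2, 3, 4, 5, 6, 7]
direct = [sum(xn * cmath.exp(-2j * cmath.pi * n * k / 8) for n, xn in enumerate(x))
          for k in range(8)]
assert all(abs(a - b) < 1e-9 for a, b in zip(fft_iterative(x), direct))
```
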
Summary

• We developed the structure of the basic decimation-in-time FFT

• Use of the FFT algorithm reduces the number of multiplys required to perform the DFT by a factor of more than 100 for 1024-point DFTs, with the advantage increasing with increasing DFT size

• Next time we will consider inverse FFTs, alternate forms of the FFT, and FFTs for DFT sizes that are not an integer power of 2

