DSP 5th Unit
11.1 INTRODUCTION TO PROGRAMMABLE DSPs
A digital signal processor (DSP) is a specialized microprocessor designed specifically for digital signal processing, generally in real time computing. Programmable DSPs (P-DSPs) contain special architecture and instruction sets so as to execute computation-intensive DSP algorithms more efficiently. Some advanced microprocessors may have performance close to that of P-DSPs. However, in terms of low power requirements, cost, real time I/O compatibility and [...], P-DSPs have an advantage over the conventional microprocessors. P-DSPs can be divided into two categories:
1. General purpose DSPs: These are basically high speed microprocessors with architecture and instruction sets optimized for DSP operations. They include fixed point processors such as Texas Instruments TMS320C5X, TMS320C5[...] and Motorola DSP563X, and floating point processors such as Texas Instruments TMS320C4X, TMS320C67XX, and Analog Devices ADSP21XXX.
2. Special purpose DSPs: This type of processor consists of hardware (i) designed for specific DSP algorithms such as [...], and (ii) designed for specific applications such as PCM and filtering. Examples of special purpose DSPs are [...] PDSP16515A and UPDSP16256.
The factors that influence the selection of DSP for a given application are: architectural features, execution speed, and type of arithmetic and word length.
11.2 ADVANTAGES OF DSP PROCESSORS OVER CONVENTIONAL MICROPROCESSORS
1. The DSP processors support simultaneous fetching of multiple operands.
2. The DSP processors have three separate computational units: arithmetic logic unit, multiplier and shifter.
3. DSP processors consist of powerful interrupt structure and timers.
4. The DSP processors have multiprocessing ability.
5. The DSP processors have on chip program memory and data memory.
11.3 MULTIPLIER AND MULTIPLIER ACCUMULATOR (MAC)
Multiplication is one of the most common operations required in digital signal processing applications. Examples are convolution and correlation, which require array multiplication operations. Implementation of the convolver with single multiplier/adder is shown in Figure 11.1. One of the important requirements of these array multipliers is that they should process the signals in real time. The array multiplication should be completed before the next sample of the input signal arrives at the input to the array. This requires the multiplication as well as the accumulation to be carried out using hardware elements. There are two approaches to solve this problem.
1. A dedicated MAC unit may be implemented in hardware, which integrates multiplier and accumulator in a single hardware unit. An example of this approach is the Motorola DSP5600X.
2. Separate multiplier and accumulator. An example of this approach is the TMS320C5X. In this approach, the output of the multiplier is stored into the product register and the content of the product register is added to the accumulator by the central ALU.
In both approaches, the MAC operation can be completed in one clock cycle. The availability of multipliers and/or a multiplier accumulator is one of the mandatory requirements of P-DSPs.
The response of the convolver to the input sequence is given by

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k)$$

where N is the length of the array h. The array of input samples x_n is multiplied with the array h. The array x_{n+1} is obtained by shifting the array x_n right by 1 position, so that the ith element of x_n becomes the (i + 1)th element of x_{n+1}. [...] before the new product is stored [...] the product register is added to the [...] the next location, whose address is 'dma + 1'.

Figure 11.1 Implementation of convolver with single multiplier/adder.
11.4 MODIFIED BUS STRUCTURES AND MEMORY ACCESS SCHEMES IN P-DSPs
The MACD instruction, that is, the MAC operation with data move, requires four memory accesses [...]. The four memory accesses/clock period required for the MACD instruction are as follows:
1. Fetch the MACD instruction from the program memory.
2. Fetch one of the operands from the program memory.
3. Fetch the second operand from the data memory.
4. Write the content of the data memory with address 'dma' into the location with address 'dma + 1'.
The relatively static impulse response coefficients are stored in the program memory and the samples of the input data are stored in the data memory.
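The four accesses listed above can be illustrated with a minimal Python sketch of one FIR output computed as a chain of MACD-style steps. The function name `macd_fir_step` and the list-based "memories" are assumptions made for illustration only; they do not reproduce the actual instruction semantics of any particular P-DSP.

```python
def macd_fir_step(coeffs, delay_line, new_sample):
    """Compute one FIR output sample as a chain of MACD-style steps.

    coeffs     -- impulse response h(0)..h(N-1), standing in for program memory
    delay_line -- list of N input samples, delay_line[k] holds x(n-k),
                  standing in for data memory
    """
    delay_line[0] = new_sample              # newest sample at the lowest address
    acc = 0
    # Start at the oldest sample so each data move overwrites a used value.
    for dma in range(len(coeffs) - 1, -1, -1):
        acc += coeffs[dma] * delay_line[dma]        # multiply and accumulate
        if dma + 1 < len(delay_line):
            delay_line[dma + 1] = delay_line[dma]   # data move: 'dma' -> 'dma + 1'
    return acc
```

Feeding a unit impulse through the sketch returns the coefficients one by one, which is a quick way to check that the data move keeps the delay line aligned with the coefficients.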
[...] This architecture consists of 3 buses. [...] execution of each instruction [...] The number of [...]
Figure 11.2 Von Neumann architecture.
[Figure: the CPU is connected to a data memory (results/operands) and to a program memory (instructions), each with its own address bus.]
Figure 11.3 Harvard architecture.
Since it has two memories, it is not possible for the CPU to [...] program memory and, therefore, [...] the code while it is [...]. It needs two independent memory banks. These two [...]. The processing unit consisting of the registers and [...] units, multiplier, ALU, shifters, etc., is also referred to [...]
Figure 11.4 Modified Harvard architecture
By using more buses, the number of memory accesses/clock cycle can be increased. Motorola DSP5600X, DSP96002, etc. have three separate buses and so have three memory accesses/clock cycle. TMS320C54X has four address buses and so has four memory accesses per clock cycle.
11.5 MULTIPLE ACCESS MEMORY AND MULTIPORTED MEMORY
The different techniques adopted for increasing the number of memory accesses per clock cycle are: (i) multiple access memory and (ii) multiported memory.
Multiple access memory
[...] Programmable DSP with [...] two independent [...]
Figure 11.5 Block diagram of a dual ported memory.
Some P-DSPs combine the modified Harvard architecture with the dual ported memories. For example, the Motorola DSP561XX processors have a single ported program memory and a dual ported data memory. Hence one program memory access and two data memory accesses can be achieved per clock period.
11.6 VLIW ARCHITECTURE
Very long instruction word (VLIW) architecture is another architecture used for P-DSPs (for example, in TMS320C6X). The VLIW processor consists of architecture that reads a relatively large group of instructions and executes them at the same time. These P-DSPs have a number of processing units (data paths). In other words, they have a number of ALUs, MAC units, shifters, etc. The VLIW is accessed from memory and is used to specify the operands and operations to be performed by each of the data paths. The VLIW [...] instructions that are processed per cycle. Figure 11.6
shows the block diagram of the VLIW architecture. The multiple functional units share a
common multiported register file for fetching the operands and storing the results.
The read/write cross bar provides parallel access by multiple functional units to the multiported register file. Execution of the [...] is carried out concurrently with the load/store operation of data [...] RAM and the register file.

Figure 11.6 Block diagram of the VLIW architecture (multiported register file and read/write cross bar).
The performance gains that can be achieved with VLIW architecture depend on the degree of parallelism in the algorithm selected [...] and the number of functional units. The throughput will be higher only if the algorithm involves execution of independent operations.
11.7 PIPELINING
Instruction pipelining is a mechanism used for increasing the efficiency of the [...] microprocessors and P-DSPs. Pipelining a processor means breaking down its instruction into discrete pipeline stages which can be completed in [...] specialized hardware.
An instruction cycle starting with the fetching of an instruction and ending with the execution of the instruction, including the time for storage of the results, can be split into a number of microinstructions. Execution of each of the microinstructions is also referred to as one phase of an instruction. For example, an instruction cycle requiring four microinstructions can be said to be in four phases as follows.
1. Fetch phase: In this phase, the instruction is fetched from program memory.
2. Decode phase: In this phase, the instruction is decoded.
3. Memory read phase: In this phase, the operand required for the execution of the instruction may be read from the data memory.
4. Execution phase: In this phase, the execution as well as storage of the results in either one of the registers or memory is carried out.
In a modern processor, the above four steps get repeated over and over until the program is finished executing. Each one of the above microinstructions may be executed separately by four functional units. If we assume that each of the above phases takes equal time for completion, then if there is no pipelining, each of the functional units is busy only for 25% of the time. The functional units can be kept busy almost all the time by pipelining and processing a number of instructions simultaneously in the [...]. If the [...] cycle of the processor corresponds to T, in a period of 12T, only [...] in a machine without pipelining, whereas in the same period [...] carried out with pipelining. Hence the throughput is increased by [...].
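The throughput comparison can be checked with a short calculation. The sketch below assumes, for illustration only, that the four phases each take one period T and that a pipelined machine completes one instruction per period once the pipeline is full; the function name `instructions_completed` is hypothetical.

```python
def instructions_completed(total_periods, stages, pipelined):
    """Count the instructions finished within total_periods periods of length T."""
    if total_periods < stages:
        return 0
    if pipelined:
        # First result after `stages` periods, then one more every period.
        return total_periods - stages + 1
    # Without pipelining each instruction occupies all stages in turn.
    return total_periods // stages


plain = instructions_completed(12, 4, pipelined=False)
piped = instructions_completed(12, 4, pipelined=True)
print(plain, piped, piped / plain)
```

Under these assumptions, a period of 12T yields 3 completed instructions without pipelining and 9 with pipelining, a threefold gain. The gain approaches the number of stages as the period grows.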
The performance and programming simplification [...] with the elimination of pipeline [...] bottlenecks in the program fetch, multiply [...] increased pipelining [...]
11.8 SPECIAL ADDRESSING MODES IN P-DSPs
In addition to the addressing modes such as direct, indirect and immediate supported by the conventional microprocessor, P-DSPs have special addressing modes that permit single word instruction format and thereby speed up execution by making effective use of the instruction pipelining. Further, there are special addressing modes such as circular (cyclic) addressing and bit reversed addressing that are specially tailored for DSP applications. The addressing modes in P-DSPs are as follows:
1. Short immediate addressing
2. Short direct addressing
3. Memory mapped addressing
4. Indirect addressing
5. Bit reversed addressing
6. Circular addressing
Short immediate addressing
In short immediate addressing mode, the operand is specified as a short constant that forms
part of a single word instruction. The length of the short constant depends on the
DSP and the instruction type. In TMS320C5X DSPs, an 8 bit constant can be
specified as one of the operands in the single word instruction like AND, OR, addition,
subtraction, etc.
Short direct addressing
In short direct addressing mode, the lower order address of the operand is specified in the single word instruction. In TI TMS320 DSPs, the higher order 9 bits of the memory address are stored in the data page pointer and only the lower 7 bits are specified as part of the instruction. Using short direct addressing in the Motorola DSP5600X processor, an instruction is specified with a 6 bit address.
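The way the 9 bit data page pointer and the 7 bit field of the instruction combine into a 16 bit data address can be sketched as follows. The function name `c5x_direct_address` is an illustrative assumption, not an identifier from any TI tool.

```python
def c5x_direct_address(dp, dma):
    """Combine a 9 bit data page pointer with a 7 bit in-instruction offset."""
    if not 0 <= dp < 2**9:
        raise ValueError("DP must fit in 9 bits")
    if not 0 <= dma < 2**7:
        raise ValueError("offset must fit in 7 bits")
    # DP supplies the upper 9 address bits, the instruction the lower 7.
    return (dp << 7) | dma
```

Each value of the data page pointer therefore selects one page of 128 words, and the instruction only has to carry the position within that page.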
Memory mapped addressing
In this addressing mode, the CPU registers and the I/O registers are accessed as memory locations by storing them in either the starting page or the final page of the memory space. In the case of TMS320C5X, page 0 corresponds to the CPU registers and I/O registers. In the case of Motorola DSP5600X, the last page of the memory space containing 64 locations is the memory map for the CPU and I/O registers.
Indirect addressing
This mode has a number of options in P-DSPs. This mode permits an array of data to be efficiently fetched and stored. The address of the operands can be stored in registers called indirect address registers. In the case of TI processors, these registers are called auxiliary registers (ARs). When the operands fetched are being executed, these registers can be updated. This is made possible by having an additional ALU in the CPU core specifically for the address update of ARs. The increment or decrement of ARs can be performed in steps of 1 or in steps specified by the content of an offset register. In P-DSPs from Texas Instruments, the offset register is known as INDEX register and in the case of [...] the offset register is known as modifier register. The content of the indirect address registers may also be updated by a constant using bit reversed addressing mode.
Bit reversed addressing
The bit reversed binary pattern corresponding to a particular decimal number is obtained by writing the natural binary equivalent of the number in the reverse order. Therefore, the least significant bit of the bit reversed number becomes the most significant bit of the natural binary number and vice versa. In this addressing mode, the address is incremented or decremented by the number represented in the bit reversed form.
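A minimal sketch of the bit reversal itself is given below. The helper names `bit_reverse` and `bit_reversed_order` are assumptions for illustration, and the buffer length is assumed to be a power of two, as is usual when this mode is used to reorder FFT data.

```python
def bit_reverse(value, bits):
    """Return value with its lowest `bits` bits written in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)   # move the current LSB into the result
        value >>= 1
    return result


def bit_reversed_order(n_points):
    """List the indices 0..n_points-1 in bit reversed order (n_points a power of two)."""
    bits = n_points.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n_points)]
```

For an 8 point buffer, `bit_reversed_order(8)` gives `[0, 4, 2, 6, 1, 5, 3, 7]`: index 1 (001) maps to 4 (100), index 3 (011) maps to 6 (110), and so on.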
Circular addressing mode
In real time processing of signals, the input signal is continuously stored in a memory space and the processed data is stored in another memory space continuously and may be written on to the output device. In this case, the input as well as output program will be simple. However, since the input as well as the output memory space is finite, it would be exhausted after processing the input signal for some time, if the data is written into the memory by using linear addressing mode. This problem may be overcome by checking continuously whether the range of either the input or the output memory space is exceeded. In that case, the new data is to be stored starting from the beginning of the particular memory space. Checking this condition is an overhead that can be overcome using the circular addressing mode. In this mode, the memory can be organized as a circular buffer with the beginning memory address and the ending memory address corresponding to this buffer defined by the programmer. In the circular addressing mode, when the address pointer is incremented, the address will be checked with the ending memory address of the circular buffer. If it exceeds that, the address will be made equal to the beginning address of the circular buffer.
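The pointer update described above can be sketched in a few lines. On a P-DSP this check is done by the address generation hardware at no cost in cycles; the Python function `circular_increment` below is only an illustrative model, with hypothetical buffer addresses in the usage lines.

```python
def circular_increment(pointer, start, end):
    """Advance the pointer by one, wrapping from the end address back to the start."""
    pointer += 1
    if pointer > end:          # past the ending address of the circular buffer
        pointer = start        # restart at the beginning address
    return pointer


# Walk a four word buffer occupying addresses 0x100..0x103.
address = 0x102
visited = []
for _ in range(4):
    address = circular_increment(address, 0x100, 0x103)
    visited.append(address)
```

Starting from 0x102, the pointer visits 0x103, 0x100, 0x101 and 0x102, never leaving the buffer.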
11.9 ON-CHIP PERIPHERALS
The P-DSPs have a number of on-chip peripherals that relieve the CPU from [...] functions. Further, they also help to reduce the chip count on the DSP [...] P-DSP. Some of the on-chip peripherals in P-DSPs and their functions are given below.
11.9.1 On-chip Timer
Two of the common applications of timers are generation of periodic interrupts [...] and generation of the sampling clocks for A/D converters. [...]
This is a special serial port the P-DSPs have. This port permits a P-DSP to communicate with other devices or P-DSPs by using time division multiplexing (TDM).

Parallel ports enable the communication between the P-DSP and other devices to be faster than the serial communication by using a number of lines in parallel. In addition, there are additional lines, which are for strobing or for hand shaking purpose.

These are additional I/O ports the P-DSPs have that are single bit wide. These port bits may be [...] set, reset or read. These bits are normally used for control purposes, but they [...] data transfer.
This is a special parallel port the P-DSPs have. This enables the P-DSPs to communicate with a microprocessor or a PC, which is called a host. In addition to data transfer, the host can generate interrupts and also cause the P-DSP to load a program into its RAM on reset.

There are also ports that are used for interprocess communication between a number of P-DSPs in a multiprocessor system.

A/D and D/A Converters
P-DSPs targeted towards voice applications such as cellular telephones and [...] have A/D and D/A converters inside the P-DSPs.
11.10 P-DSPs WITH RISC AND CISC
P-DSPs may be implemented using either the RISC processors or the CISC processors. The relative advantages of each of these processors are given below.
11.10.1 Advantages of Reduced Instruction Set Computer (RISC)
1. In a RISC processor, the control unit uses only around 20% of the chip area because of the reduced number of instructions. Hence the remaining area can be used for incorporating other features.
2. The delayed branch and call instructions are used to improve the speed of RISC processors.
3. The execution time required for all the instructions of a RISC processor [...] because all the instructions are of uniform length.
4. The RISC processors have smaller and simpler control units, which have fewer [...].
5. The speed of the RISC processor is high because of smaller control [...] smaller propagation delays.
6. Since a simplified instruction set allows for a pipelined superscalar design, [...] using comparable semiconductor technology and the same clock rates.
7. Because the instruction set of a RISC processor is simpler, it uses much less chip space than a CISC processor. Extra functional units such as memory [...] units or floating point arithmetic units can be placed on the same chip.
8. The throughput of the processor can be increased by applying pipelining and parallel processing.
9. Since RISC processors can be designed more quickly, they can take advantage of new technological developments sooner than corresponding CISC designs.
10. High level language (HLL) support: The programs can be written in C [...] relieves the programmer from learning the instruction set of a P-DSP [...] increases the throughput of the programmer.
11.10.2 Advantages of Complex Instruction Set Computer (CISC)
1. The CISC processors have a very rich instruction set that [...] language constructs similar to "if condition true then do", "for" [...].
2. The CISC processors have instructions specifically required for [...] such as MACD, FIRS, etc.
3. The assembly language program of a CISC processor is very short [...].
4. For RISC architecture compilers are essential. [...] compilers are not required. Hence they are of low cost.
5. New CISC [...] are designed to be upward compatible [...] processors. This makes the learning curve steeper.
6. Microprogramming is easier to implement and much less expensive [...] control unit.
7. [...] the microprogram instruction sets can be written to match the constructs of [...] languages, the compiler need not be very complicated.
8. [...] each instruction is more capable, fewer instructions could be used to implement a given task. This made more efficient use of the relatively slow main memory.
11.11 ARCHITECTURE OF TMS320C50
The TMS320C50 belongs to the C5X generation of Texas Instruments digital signal processors and is fabricated with CMOS IC technology. It is a fixed point, 16 bit processor running at [...]. The single instruction execution time is 50 nsec. Its architectural design is based on a combination of advanced Harvard architecture, on-chip peripherals and on-chip memory.
[Figure: block diagram showing the on-chip memory (including DARAM blocks B0, B1 and B2), the CPU (multiplier, accumulator, ACC buffer, shifters, ALU, PLU, ARAU and memory mapped registers) and the peripherals (serial port, TDM serial port, buffered serial port, timer and host port interface).]
Figure 11.7 Architecture of TMS320C50.
Moreover, the TMS320C50 has a highly specialized instruction set [...] operational flexibility and the device speed, which [...] the signal processor as the suitable device for [...]. The TMS320C50 has a programmable [...] application. On-chip memory includes 10K words of [...].
All C5X DSPs have the same CPU structure. However, they have different on-chip memory configurations and on-chip peripherals.
The functional block diagram of TMS320C5X is shown in Figure 11.7. It is divided into four sub blocks. They are: (1) Bus structure, (2) Central processing unit, (3) On-chip memory and (4) On-chip peripherals.
11.12 BUS STRUCTURE
Separate program and data buses in the advanced Harvard architecture [...] power and provide a high degree of parallelism. Many [...] accomplished using single cycle multiply/accumulate instruction. The C5X includes the control mechanism to manage interrupts, [...] function calling. The C5X architecture has four buses:
1. Program bus (PB)
2. Program address bus (PAB)
3. Data read bus (DB)
4. Data read address bus (DAB)
The program bus carries the instruction code and immediate operands from program memory to the CPU. The program address bus provides addresses to program memory for both read and write.
The data read bus interconnects various elements of the CPU [...]. The data read address bus provides the address to access the data memory.
11.13 CENTRAL PROCESSING UNIT
The CPU consists of the following elements:
1. Central arithmetic logic unit (CALU)
2. Parallel logic unit (PLU)
3. Auxiliary register arithmetic unit (ARAU)
4. Memory mapped registers
5. Program controller
11.13.1 Central Arithmetic Logic Unit (CALU)
The CPU uses the CALU to perform 2's complement arithmetic. The CALU consists of the following elements:
1. Parallel multiplier (16 x 16 bit)
2. Accumulator (32 bit)
3. Accumulator buffer (ACCB) (32 bit)
4. Product register (PREG)
5. Arithmetic logic unit (ALU)
A 16 x 16 bit signed/unsigned multiplication operation can be performed in parallel within one machine cycle. All multiply instructions except the MPYU (multiply unsigned) instruction perform a signed multiply operation in the multiplier. One of the inputs to the multiplier is from the 16 bit temporary register 0 (TREG0) and the second input is from the program bus or data bus. The product register (PREG) holds the product.
The 32 bit ALU along with the 32 bit accumulator carries out arithmetic and logic operations, executing most of them in one machine cycle. Here the accumulator provides one of the inputs to the ALU, whereas the product register, accumulator buffer, or scaling shifter output provides the second input. The results of operations performed in the ALU are stored in the accumulator.
The scaling shifter has a 16 bit input connected to the data bus and a 32 bit output connected to the ALU. The scaling shifter produces a left shift of 0 to 16 bits on the input data. A 5 bit register TREG1 specifies the number of bits by which the scaling shifter should shift, or the shift count is specified by a constant embedded in the instruction word.
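The data widths involved in the multiplier and the scaling shifter can be made concrete with a small model. This is a sketch under stated assumptions: the helper names `to_signed`, `multiply` and `scaled_add` are hypothetical, sign extension of the 16 bit input is assumed to be enabled, and overflow simply wraps around in 32 bits.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply(treg0, operand):
    """16 x 16 bit signed multiply; the 32 bit result models PREG."""
    return to_signed(to_signed(treg0, 16) * to_signed(operand, 16), 32)


def scaled_add(acc, data, shift):
    """Add a 16 bit data word, left shifted by 0..16 bits, to the 32 bit accumulator."""
    if not 0 <= shift <= 16:
        raise ValueError("shift count must be 0..16")
    return to_signed(acc + (to_signed(data, 16) << shift), 32)
```

For example, `multiply(0xFFFF, 2)` treats 0xFFFF as -1 and returns -2, while `scaled_add(0, 1, 16)` places a 16 bit word in the upper half of the accumulator and returns 65536.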
11.13.2. Parallel Logic Unit (PLU)
The parallel logic unit (PLU) is another logic unit that executes logic operations on data without affecting the contents of the accumulator. Multiple bits in a status/control register or any memory location can be directly set, cleared, or toggled by the PLU. After executing the logical operation, the PLU writes the result of the operation to the same memory location from which the first operand was fetched.
11.13.3 Auxiliary Register Arithmetic Unit (ARAU)
The C5X consists of a register file containing eight auxiliary registers (AR0-AR7), each of 16 bit length, a 3 bit auxiliary register pointer (ARP) and an unsigned 16 bit ALU. The auxiliary register file is connected to the auxiliary register arithmetic unit.
Auxiliary registers
The eight 16 bit auxiliary registers (AR0-AR7) can be accessed by the CALU and modified by the ARAU or the PLU. The primary function of ARs is to provide 16 bit addresses for indirect addressing to data space or for temporary data storage. The ARs can also be used as [...] registers or counters. The contents of the ARs can be stored in the data memory or used as inputs to the CALU.
Index register (INDX)
The index register (INDX) is a 16 bit register used by the ARAU as a step value to modify the address in the ARs during indirect addressing. The INDX can be added to or subtracted from the current AR on any AR update cycle. The INDX can be used to increment or decrement the address in steps larger than 1.
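Stepping an auxiliary register by the content of INDX can be sketched as follows. The function `walk_with_index` and the list standing in for data memory are assumptions for illustration; the sketch only models the post-modification of a 16 bit address, not the instruction syntax.

```python
def walk_with_index(memory, ar, indx, count):
    """Fetch `count` operands through AR, adding INDX to AR after each access."""
    fetched = []
    for _ in range(count):
        fetched.append(memory[ar])      # indirect access through the current AR
        ar = (ar + indx) & 0xFFFF       # 16 bit address update by the step in INDX
    return fetched, ar


memory = list(range(100, 120))
samples, final_ar = walk_with_index(memory, ar=2, indx=4, count=3)
```

With a step of 4, the sketch reads locations 2, 6 and 10 and leaves the register pointing at 14. This is the access pattern needed, for example, to read every fourth sample of a buffer.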
Auxiliary register compare register (ARCR)
The ARCR is a 16 bit register used for [...] and supports logical comparisons between the current AR and ARCR [...] CMPR instruction. The result of this comparison is [...].
Block move address register (BMAR)
The BMAR is a 16 bit register that holds the [...] the block move. It can also hold the address value of an [...] multiply accumulate operation.
11.13.4 Memory Mapped Registers
The C5X has 96 registers mapped [...]. [...] mapped register space contains various [...] serial port, timer and software wait generators. Additionally, [...] locations are mapped into this data memory space, allowing them to be accessed as data memory using single word instructions or as I/O locations with two-word [...].
Instruction register (IREG)
The 16 bit instruction register (IREG) holds the op code of the instruction being executed.
Interrupt registers (IMR, IFR)
The 16 bit interrupt mask register (IMR) individually masks specific interrupts [...] time. The 16 bit interrupt flag register (IFR) indicates the current status of the interrupts.
Status registers
The two 16 bit status registers contain status and control bits for the [...].
11.13.5 Program Controller
The program controller contains logic circuitry that decodes the [...], manages the CPU pipeline, stores the status of CPU [...] operations. It consists of the following elements:
(i) Program counter (PC)
(ii) Status and control registers
(iii) Hardware stack
(iv) Program memory address generation
(v) Instruction registers
(i) Program counter (PC)
The PC is a 16 bit counter which contains the address of internal or external program memory [...]. The PC addresses program memory, either on chip or off chip, via the program address bus (PAB). Through the PAB an instruction is loaded into the instruction register (IREG). Then the PC is ready to start the next instruction fetch cycle.
(ii) Status and control registers
The C5X has four status and control registers. They are the circular buffer control register, the processor mode status register, and the status registers ST0 and ST1.
(iii) Hardware stack
The stack is 16 bit wide and 8 levels deep and is accessible via the push and pop instructions. The stack is used during interrupts and subroutines to save and restore the PC contents.
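The behaviour of a fixed depth return address stack can be modelled in a few lines. The class `HardwareStack` below is a hypothetical sketch; in particular, the rule that a push onto a full stack discards the oldest entry is an assumption of this model and is not stated in the text above.

```python
from collections import deque


class HardwareStack:
    """Model of an 8 level, 16 bit wide return address stack."""

    def __init__(self, depth=8):
        self._levels = deque(maxlen=depth)

    def push(self, pc):
        # Assumption: with the stack full, the oldest entry is lost.
        self._levels.append(pc & 0xFFFF)

    def pop(self):
        return self._levels.pop()

    def __len__(self):
        return len(self._levels)
```

Because only 8 return addresses fit, nesting subroutine calls and interrupts more than 8 deep loses a return address in this model, which is why deeper nesting normally requires saving stack contents to data memory in software.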
(iv) Program memory address generation
It contains the code for application and holds table information and immediate operands. The program memory is accessed only by the program address bus.
(v) Instruction registers
The 16 bit instruction register (IREG) holds the op code of the instruction being executed.
11.14 SOME FLAGS IN THE STATUS REGISTERS
The bit assignment details for flags in status registers ST0 and ST1 are shown in Figure 11.8.

Bit:     15-13   12   11    10   9      8-0
Field:   ARP     OV   OVM   1    INTM   DP
(a) Status Register 0 (ST0) bit assignment

[...]
(b) Status Register 1 (ST1) bit assignment

Figure 11.8 (a) Status Register 0 (ST0) bit assignment, (b) Status Register 1 (ST1) bit assignment.

The significance of the various bits of ST0 and ST1 is as follows:
ARP (Auxiliary register pointer): These bits select the AR to be used [...].
OV (Overflow flag bit): This bit indicates an arithmetic operation overflow [...].
OVM (Overflow mode): This bit enables/disables the accumulator overflow saturation mode in the ALU.
INTM (Interrupt mode): This bit globally masks or enables all interrupts. It has no effect on the non-maskable RS and NMI interrupts.
DP (Data page pointer): These bits specify the address of the [...].
ARB (Auxiliary register buffer): This 3 bit field holds the previous value [...].
CNF (On-chip RAM configuration control bit): This 1 bit field enables the on-chip dual access RAM block 0 (DARAM B0) to be addressable in data memory space or program memory space.
TC (Test/control flag bit): This 1 bit flag stores the results of the ALU or PLU test bit operations. The status of the TC bit determines if the conditional branch, call and return instructions are to be executed.
SXM (Sign extension mode bit): This 1 bit field enables/disables sign extension of an arithmetic operation.
C (Carry bit): This 1 bit field indicates a carry or borrow in an ALU arithmetic operation.
HM (Hold mode bit): This 1 bit field indicates whether the CPU stops or continues when acknowledging an active HOLD signal.
XF (XF pin status bit): This 1 bit field determines the level of the XF output pin.
PM (Product shift mode bits): This 2 bit field determines the product shifter mode and shift value for PREG output into the ALU.
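The effect of the OVM bit on a 32 bit accumulator can be sketched as follows. The function `accumulate` is a hypothetical model: with saturation enabled an overflowing result is clamped to the largest positive or negative 32 bit value, and otherwise it wraps around. The returned flag plays the role of the OV bit.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def accumulate(acc, value, ovm):
    """Add value to a 32 bit accumulator; return (new_acc, overflow_flag)."""
    total = acc + value
    overflow = not INT32_MIN <= total <= INT32_MAX
    if overflow and ovm:
        total = INT32_MAX if total > 0 else INT32_MIN    # saturate
    elif overflow:
        total = (total + 2**31) % 2**32 - 2**31          # wrap around
    return total, overflow
```

Saturation is usually preferred in signal processing because a clipped sample is a far smaller error than the sign flip produced by wrap-around.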
11.15 ON-CHIP MEMORY
The C5X structure has a total memory address range of 224K words [...]. The memory space is divided into four memory segments:
1. 64K word program memory space: It contains the instructions [...].
2. 64K word local data memory space: It stores data used by the [...].
3. 64K word input/output ports: It interfaces to external [...].
4. 32K word global data memory space: It can share data [...].
The large on-chip memory of C5X includes:
1. Program read only memory
2. Data/Program single access RAM (SARAM)
3. Data/Program dual access RAM (DARAM)
11.16 ON-CHIP PERIPHERALS
All C5X DSPs have the same CPU structure; however, they have different on-chip peripherals connected to their CPUs. The TMS320C50 digital signal processor contains the following on-chip peripherals:
1. Clock generator
2. Hardware timer
3. Software-programmable wait-state generators
4. Parallel I/O ports
5. Serial port interface
6. Buffered serial port
7. TDM serial port
8. Host port interface
9. User-maskable interrupts
11.16.1 Clock Generator
The clock generator consists of an internal oscillator and a phase locked loop (PLL) circuit. The clock generator can be driven internally by a crystal oscillator or driven externally by a clock source. A clock source with a frequency lower than that of the CPU can be used, because the PLL circuit can generate an internal CPU clock by multiplying the clock source [...].
11.16.2 Hardware Timer
[...] programmable hardware timer with a 4 bit prescaler clocks at a rate that is between 1/2 [...] machine cycle rate, depending upon the timer's divide down ratio. It acts as [...] producing interrupts to the CPU at regular intervals. The timer can be stopped, [...] or disabled by specific status bits. Three registers, namely the timer counter [...] register (PRD) and timer control register (TCR), control and [...]. The timer counter register gives the current count of the timer. The timer [...]. The timer control register controls the [...].
11.16.3 Software-programmable Wait-state Generators
Software-programmable wait-state generators can be interfaced without any external [...] off-chip memory and I/O devices. This feature consists of multiple [...] circuits. Each circuit is user-programmable to operate in different wait [...] accesses.
11.16.4 General Purpose I/O Pins
There are two general purpose I/O pins, namely the branch control input (BIO) pin and the external flag output (XF) pin, that are controlled by software. The pin BIO monitors the status of the external devices and executes conditional branches [...]. The pin XF communicates with external devices through software. The instruction SETC XF sets the XF pin to high and CLRC XF resets it to 0.
11.16.5 Parallel I/O Ports
There is a total of 64K parallel I/O ports, of which 16 are memory mapped in data memory space. All the I/O ports can be addressed through the instructions IN and OUT or [...] data memory read and write instructions. There is a signal IS [...] of memory mapped I/O space to that of program and data space. Interfacing with external I/O devices can be done through the I/O ports with minimal off-chip address decoding circuits.
11.16.6 Serial Port Interface
Three different kinds of serial ports are available: a general purpose serial port, a time division multiplexed (TDM) serial port and a buffered serial port (BSP). Each C5X contains at least one general purpose high-speed synchronous, full duplexed serial port interface. The serial port is capable of operating at up to one fourth the machine cycle rate.
The serial port control (SPC) register, the data receive register (DRR), the data transmit register (DXR), the data transmit shift register (XSR) and the data receive shift register (RSR) control and operate the serial port interface.
The serial port control register contains the mode control and status bits of the serial port. The data receive register holds the incoming serial data. The data transmit shift register controls the shifting of the data from the data transmit register to the output pin. The data receive shift register controls the storing of the data from the input pin to the data receive register.
11.16.7 Buffered Serial Port
The buffered serial port (BSP) is available on the C56 and C57 devices. [...] double-buffered serial port and an auto buffering unit (ABU). The BSP [...] data stream length. The ABU supports high-speed data transfer [...] latencies. Five BSP registers control and operate the BSP.
11.16.8 TDM Serial Port
This is the third type of serial port that is utilized by the devices. [...] duplexed serial port that can be configured by software [...] TDM operation. The TDM serial port is commonly [...].
11.16.9 Host Port Interface
[...] C57. It is an 8 bit parallel I/O port that provides [...]. Information is exchanged between the DSP and the host [...] that is accessible to both the host processor and the [...].
11.17 USER MASKABLE INTERRUPTS
[...] external interrupt lines (INT1, INT2, INT3, INT4) and five internal interrupts, a timer [...] serial port interrupts, are user maskable. [...] interrupt service routine (ISR) is executed, the contents of the program [...] an 8 level hardware stack and the contents of 11 specific CPU registers [...] deep stack. When a return from interrupt instruction is executed, the CPU [...] are restored.
Interrupt Types Supported by TMS320C5X Processor
55X devices have four external, maskable user interrupts (INT4 - INT1) that
N use to interrupt the processor; there is one external nonmaskable
internal interrupts are generated by the serial port (RINT and XINT),
TABLE 11.1 Interrupt locations and priorities
Location Priority Function
Hex
1 (Highest) External reset signal
3 External interrupt line-1
4 External interrupt line-2
5 External interrupt line-3
6 Internal timer interrupt
7 Serial port receive interrupt
8 Serial port transmit interrupt
9 TDM port receive interrupt
10 TDM port transmit interrupt
External user interrupt jine-4 &
Reserved
TRAP instruction vector8M =
wal Sena! Prvvscane
the timer (TINT), the TDM port (TRNT and TXNT), and the software interrupt instructions
(TRAP, NMI, and INTR). Interrupt priorities are set so that reset has the highest
priority and INT4 has the lowest priority. The NMI has the second highest priority.
Vector-relative locations and priorities for all external and internal interrupts are shown
in Table 11.1. No priority is set for the TRAP instruction, but it is included because it has its
own vector location. Each interrupt address has been spaced apart by two locations so that
branch instructions can be accommodated in those locations.
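The priority scheme of Table 11.1 can be sketched as a small arbiter that picks which pending interrupt is serviced first. This is a minimal sketch under stated assumptions: the function name and the set-based interface are hypothetical, the value 11 for INT4 is assumed (the text only says INT4 has the lowest priority), and reset is treated as not maskable.

```python
# Priorities taken from Table 11.1; a smaller number means a higher priority.
PRIORITY = {
    "RESET": 1,
    "INT1": 3,
    "INT2": 4,
    "INT3": 5,
    "TINT": 6,   # internal timer
    "RINT": 7,   # serial port receive
    "XINT": 8,   # serial port transmit
    "TRNT": 9,   # TDM port receive
    "TXNT": 10,  # TDM port transmit
    "INT4": 11,  # assumed value: the text only states that INT4 is lowest
}


def next_interrupt(pending, enabled):
    """Return the pending, unmasked interrupt with the highest priority, or None."""
    candidates = [name for name in pending if name == "RESET" or name in enabled]
    if not candidates:
        return None
    return min(candidates, key=PRIORITY.__getitem__)


print(next_interrupt({"XINT", "INT2", "TINT"}, {"XINT", "INT2", "TINT"}))
print(next_interrupt({"XINT", "INT2"}, {"XINT"}))
```

In the first call INT2 is chosen because it has the smallest priority number among the pending requests; in the second call INT2 is masked, so XINT is serviced.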
11.18 ON-CHIP PERIPHERALS AVAILABLE IN TMS320C3X PROCESSOR
The TMS320C3X generation is the first of Texas Instruments' 32 bit floating-point DSPs.
C3X devices provide an easy-to-use, high performance architecture and can be used in a
wide variety of areas including automotive applications, digital audio, industrial
and control, data communication, and office equipment that include
peripherals, copiers and laser printers.
The C3X processor peripherals include serial ports, timers and on-chip DMA (Direct
Memory Access) controllers. The timers and DMA controllers are discussed below.
Timers
The C3X timers are general-purpose 32 bit timers/event counters. Figure
block diagram of the timer. The timer operates in two signaling modes, internal or external
clocking. With internal clock we can use the timer to signal external devices such as
A/D converter, D/A converters etc. When external clock is applied to the timer, it counts
the external events and interrupts the CPU after a specified number of events. The timer
has an I/O pin that can be configured as an input, output or general purpose I/O pin.
are used by the timer. These registers can be accessed using the
values.
Figure: Block diagram of the timer, showing the internal clock, the 32 bit counter and period registers, the comparator (period = counter), the pulse generator, TSTAT and the timer output.
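The counter, period register and comparator arrangement of the timer can be illustrated with a toy model. This is a sketch under assumptions rather than a description of the actual C3X logic: the class and method names are hypothetical, and it is assumed that the counter is reset to zero and TSTAT is set in the clock cycle in which the counter equals the period.

```python
class ToyTimer:
    """Toy 32 bit timer: counts clock ticks and signals when counter == period."""

    MASK = 0xFFFFFFFF  # registers are 32 bits wide

    def __init__(self, period):
        self.period = period & self.MASK
        self.counter = 0
        self.tstat = 0  # status flag, set on a match (assumed behaviour)

    def tick(self):
        """Advance by one clock or one external event; return True on a match."""
        self.counter = (self.counter + 1) & self.MASK
        if self.counter == self.period:
            self.counter = 0   # assumed: the counter restarts after a match
            self.tstat = 1
            return True
        self.tstat = 0
        return False


timer = ToyTimer(period=3)
print([timer.tick() for _ in range(7)])
```

With an internal clock each call to `tick` stands for one clock period; with an external clock it stands for one counted event, and a `True` result corresponds to the point at which the CPU would be interrupted.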
Figure 11.10 Block diagram of DMA controller (port control registers, peripheral address bus).
SHORT QUESTIONS WITH ANSWERS
List three leading manufacturers of P-DSPs.
Ans. Texas Instruments (TI), Analog Devices, and Motorola are the three leading manufacturers of P-DSPs.
d microprocessors with
ns. They include fixed
are TMS320C5X.
What are the factors that influence the selection of DSP for a given application?
Ans. The factors that influence the selection of DSP for a given application are: architectural features, execution speed and type of arithmetic and word length.
Why is any operation that involves an off-chip memory slow compared to that using the on-chip memory in P-DSPs?
Ans. Multiple buses are available only for connecting the on-chip memory to the processor. For accessing off-chip memory only a single bus is used for both the program memory and the data memory. Hence any operation that involves off-chip memory is slow compared to that using the on-chip memory.
What is an instruction cycle?
Ans. An instruction cycle is the time that elapses from the moment an instruction is fetched until the instruction completes execution, including the time taken for writing the result into a register or memory.
How can the number of memory accesses per clock cycle be increased?
Ans. The number of memory accesses per clock period can be increased by using multiple access memory or multiport memory.
Mention the architectures used for P-DSPs.
Ans. The architectures used for P-DSPs are: (i) Von Neumann architecture, (ii) Harvard architecture, (iii) Modified Harvard architecture, and (iv) VLIW architecture.
Give an example of a P-DSP which uses VLIW architecture.
Ans. A P-DSP which uses VLIW architecture is TMS320C6X.
What is data path?
Ans. The unit consisting of the registers and processing elements is referred to as data path.
Which architecture do the P-DSPs follow?
Ans. P-DSPs follow the modified Harvard architecture.
Mention an approach for increasing the efficiency of P-DSPs.
Ans. An approach for increasing the efficiency of P-DSPs is instruction pipelining.
pipeline?
ly in the CPU are
What is the use of TDM serial port?
Ans. The TDM serial port permits a P-DSP to communicate with other devices using time division multiplexing.
18. What is the use of single bit I/O ports?
Ans. Single bit I/O ports are used for control purposes but they can also be used for data transfer. The bits are also used for conditional branches and calls.
What is meant by host port?
Ans. The host port is a special parallel port in P-DSPs used to communicate with a microprocessor or PC.
What is the function of the host port?
Ans. In addition to data communication, the host port can generate interrupts and also cause the P-DSP to load a program from ROM to the RAM on reset.
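What do you mean by comm ports?
Ans. Comm ports are parallel ports that are used for interprocessor communication between a number of identical P-DSPs in a multiprocessor system.
In RISC processors what percentage of chip area may be used for the control unit, and in CISC what percentage?
Ans. In RISC processors 20% of the chip area may be used for the control unit; in CISC processors 30 to 40% may be used.
How many buses does C5X architecture have? Name them.
Ans. The C5X has four buses and they are: Program bus (PB), Program address bus (PAB), Data read bus (DB), Data read address bus (DAB).
What is the use of program bus (PB)?
Ans. The program bus carries the instruction code and immediate operands from program memory to the CPU.
What is the use of program address bus (PAB)?
Ans. It provides addresses to program memory.
What is the use of data read bus (DB)?
Ans. It interconnects various elements of the CPU to data memory.
What is the use of data read address bus (DAB)?
Ans. It provides the address to access the data memory.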
the following elements:
(ACC), accumulator
right barrel shifter
register ALU (ARAU) consists of auxiliary register, auxiliary
and an unsigned ALU
What is the use of INDX?
Ans. The index register INDX is used by the ARAU as a step value to modify the address in the ARs during indirect addressing.
What is the use of ARCR?
What is the use of BMAR?
Ans. The block move address register holds an address value to be used with block move and multiply/accumulate operations.
What is the function of PLU?
Ans. The PLU performs Boolean operations or the bit manipulations required of controllers. It can set, clear, test or toggle bits in a status register, control register, or any data memory location. It allows logic operations to be performed on data memory values directly without affecting the contents of the ACC.
What is the use of program controller?
Ans. The program controller contains logic circuitry that decodes the instructions, manages the CPU pipeline, stores the status of CPU operations and decodes the conditional operations.
What does the program controller consist of?
Ans. The program controller consists of the following elements: program counter, status and control registers, hardware stack, address generation logic, instruction register, interrupt flag register and interrupt mask register.
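The four PLU operations mentioned in the answers above (set, clear, test and toggle) can be illustrated with ordinary bit masks. This is only a sketch: the function names are hypothetical, and a 16 bit word is assumed for the register or memory location being modified.

```python
WORD_MASK = 0xFFFF  # assumed 16 bit word


def set_bits(word, mask):
    """Force the bits selected by mask to 1."""
    return (word | mask) & WORD_MASK


def clear_bits(word, mask):
    """Force the bits selected by mask to 0."""
    return word & ~mask & WORD_MASK


def toggle_bits(word, mask):
    """Invert the bits selected by mask."""
    return (word ^ mask) & WORD_MASK


def any_bit_set(word, mask):
    """Test whether any bit selected by mask is 1."""
    return (word & mask) != 0


status = 0x0001
status = set_bits(status, 0x0010)
status = clear_bits(status, 0x0001)
status = toggle_bits(status, 0x0011)
print(hex(status), any_bit_set(status, 0x0001))
```

Each function returns a new word and uses no accumulator-like intermediate state, which mirrors the idea that the logic operation is applied directly to the stored value.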
What are the advantages of DSP processors over conventional processors?
itation of convolver with single multiplier/adder.
Structures and memory access schemes in digital signal
n explain the dual port memory
adopted for increasing the number of memory
VLIW architecture.
ined using VLIW architecture, Give an