FSMD
[Code listing: debouncing circuit built from a counter that generates a 10 ms tick
(2^N * 20 ns) and a debouncing FSM (state register, next-state/output logic)]
The preceding design uses an FSM and a timer (which is a regular sequential circuit). It is
not based on the RT methodology because the two units run independently and the FSM has
no control over the timer. As a result, the waiting period in this design is between 20 and
30 ms rather than an exact interval. This deficiency can be overcome by applying the RT
methodology.
• The circuit includes two output signals: db_level, which is the debounced output, and db_tick,
which is a one-clock-cycle enable pulse asserted at the zero-to-one transition.
• The zero and one states mean that the sw input has been stable at '0' and '1', respectively.
• The wait1 and wait0 states are used to filter out short glitches.
• The data path contains one register, q, which is 21 bits wide.
• When the sw input signal becomes '1', the FSMD moves to the wait1 state and initializes q to
"1...1". In the wait1 state, q decrements in each clock cycle. If sw remains '1', the
FSMD returns to this state repeatedly until q reaches "0...0" and then moves to the one
state. If sw returns to '0' before that, the FSMD goes back to the zero state.
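As a rough check of the 21-bit width, assuming the 20 ns clock period that appears in the
tick-counter comment (2^N * 20 ns), a full count-down of q from all ones to zero takes about:

```latex
T_{wait} \approx 2^{N} \cdot T_{clk}
        = 2^{21} \times 20\,\text{ns}
        = 2{,}097{,}152 \times 20\,\text{ns}
        \approx 41.9\,\text{ms}
```

This is in line with the "filter of 40ms" comment on the constant N in the VHDL listing.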
[Figure: ASMD chart of a debouncing circuit]
• Code with explicit data path components
The key data path component is a custom 21-bit decrement counter that can:
• Be initialized with a specific value
• Count downward or pause
• Assert a status signal when the counter reaches 0
The counter uses three signals:
• q_load: control signal that loads the initial value into the counter
• q_dec: control signal that enables counting
• q_zero: status signal, which is asserted when the counter reaches zero
• The complete data path is composed of the q register and the next-state logic of the custom
decrement counter. A comparison circuit is included to generate the q_zero status signal.
• The control path consists of an FSM, which takes the sw input and the q_zero status and
asserts the control signals, q_load and q_dec, according to the desired action in the ASMD
chart.
De-bouncing circuit with an explicit data path component
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity debounce is
   port(
      clk, reset: in std_logic;
      sw: in std_logic;
      db_level, db_tick: out std_logic
   );
end debounce;

architecture exp_fsmd_arch of debounce is
   constant N: integer:=21;  -- filter of 40ms
   type state_type is (zero, wait0, one, wait1);
   signal state_reg, state_next: state_type;
   signal q_reg, q_next: unsigned(N-1 downto 0);
   signal q_load, q_dec, q_zero: std_logic;
begin
   -- FSMD state & data registers
   process(clk, reset)
   begin
      if reset='1' then
         state_reg <= zero;
         q_reg <= (others=>'0');
      elsif (clk'event and clk='1') then
         state_reg <= state_next;
         q_reg <= q_next;
      end if;
   end process;
   -- FSMD data path (counter) next-state logic
   q_next <= (others=>'1') when q_load='1' else
             q_reg - 1 when q_dec='1' else
             q_reg;
   q_zero <= '1' when q_next=0 else '0';
   -- FSMD control path next-state logic
   process(state_reg, sw, q_zero)
   begin
      q_load <= '0';
      q_dec <= '0';
      db_tick <= '0';
      state_next <= state_reg;
      case state_reg is
         when zero =>
            db_level <= '0';
            if (sw='1') then
               state_next <= wait1;
               q_load <= '1';
            end if;
         when wait1 =>
            db_level <= '0';
            if (sw='1') then
               q_dec <= '1';
               if (q_zero='1') then
                  state_next <= one;
                  db_tick <= '1';
               end if;
            else -- sw='0'
               state_next <= zero;
            end if;
         when one =>
            db_level <= '1';
            if (sw='0') then
               state_next <= wait0;
               q_load <= '1';
            end if;
         when wait0 =>
            db_level <= '1';
            if (sw='0') then
               q_dec <= '1';
               if (q_zero='1') then
                  state_next <= zero;
               end if;
            else -- sw='1'
               state_next <= one;
            end if;
      end case;
   end process;
end exp_fsmd_arch;
Simple Processor: Design
Simple Processor
• R0, R1, R2, R3: n-bit registers
• Extern: control signal that enables the n-bit external data input
• AddSub: selects the adder/subtractor operation
  • 0: sum A + B
  • 1: difference A - B (using the 2's complement of B)
• G: stores the output of the adder/subtractor, with Gin and Gout control signals
• Operations performed in the processor
  • Load and Move operations require only one step (one clock cycle, one
    transfer across the bus)
  • Add and Sub operations require three steps
• Function: control input that specifies the operation to perform; an operation is
initiated by setting w=1
• Done: output asserted when the operation is completed
• The control circuit is based on a two-bit counter, since the longest operation needs
three steps
• Each of the decoder outputs represents a step in an operation
• T0, T1, T2, T3: no operation, step 1 to step 3
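The decoding of the two-bit count into the step signals T0 to T3 could be sketched as
follows. This is a minimal illustration, not the listing from the slides; the entity name
step_decoder and the port names count and t are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical step decoder: turns the 2-bit step count into one-hot T0..T3.
entity step_decoder is
   port(
      count : in  std_logic_vector(1 downto 0);  -- output of the 2-bit counter
      t     : out std_logic_vector(0 to 3)       -- t(0)=T0 ... t(3)=T3
   );
end step_decoder;

architecture sketch of step_decoder is
begin
   with count select
      t <= "1000" when "00",    -- T0: no operation
           "0100" when "01",    -- T1: step 1
           "0010" when "10",    -- T2: step 2
           "0001" when others;  -- T3: step 3
end sketch;
```

Because t is declared with an ascending range (0 to 3), the leftmost literal bit maps to T0.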
• Function input: specifies the operation with six bits
  • f1f0: opcode
    • 00: Load
    • 01: Move
    • 10: Add
    • 11: Sub
  • Rx1Rx0: operand Rx
  • Ry1Ry0: operand Ry
• Function Register: stores the Function inputs when FRin is asserted
Clear and FRin are generated in the same way for all operations.
• Clear:
  • ensures a count value of 00 as long as w=0 and no operation is being executed;
  • clears the count value at the end of each operation.
  $Clear = \overline{w}\,T_0 + Done$
• FRin:
  • loads the values on the Function inputs into the Function Register when w
    changes to 1.
  $FRin = w\,T_0$
• The remaining outputs of the control circuit depend on the specific step being
performed in each operation.
[Table: Control signals asserted in each operation/time step]
Logic expressions for the outputs of the control circuit
Here I0 to I3 are the decoded opcode lines (Load, Move, Add, Sub), and X0 to X3 and Y0 to Y3
are the decoded Rx and Ry fields.
$Extern = I_0 T_1$
$Done = (I_0 + I_1) T_1 + (I_2 + I_3) T_3$
Signals asserted in Add and Sub operations:
$A_{in} = (I_2 + I_3) T_1$
$G_{in} = (I_2 + I_3) T_2$
$G_{out} = (I_2 + I_3) T_3$
$AddSub = I_3$
Values of R0in ... R3in and R0out ... R3out (shown for R0):
$R0_{in} = (I_0 + I_1) T_1 X_0 + (I_2 + I_3) T_3 X_0$
$R0_{out} = I_1 T_1 Y_0 + (I_2 + I_3)(T_1 X_0 + T_2 Y_0)$
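The logic expressions above translate almost directly into concurrent VHDL assignments.
The following is a sketch only: the entity name ctrl_outputs, the one-hot vectors i, t, x, y
and the helper signals one_step and three_step are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around the control-output equations.
-- i: decoded opcode, t: decoded step, x/y: decoded Rx/Ry (all one-hot).
entity ctrl_outputs is
   port(
      i, t, x, y : in  std_logic_vector(0 to 3);
      extern, done, ain, gin, gout, addsub : out std_logic;
      rin, rout  : out std_logic_vector(0 to 3)
   );
end ctrl_outputs;

architecture sketch of ctrl_outputs is
   signal one_step, three_step : std_logic;
begin
   one_step   <= i(0) or i(1);  -- Load or Move
   three_step <= i(2) or i(3);  -- Add or Sub

   extern <= i(0) and t(1);
   done   <= (one_step and t(1)) or (three_step and t(3));
   ain    <= three_step and t(1);
   gin    <= three_step and t(2);
   gout   <= three_step and t(3);
   addsub <= i(3);

   -- Same pattern for every register k: X0/Y0 become x(k)/y(k).
   reg_ctrl: for k in 0 to 3 generate
      rin(k)  <= (one_step and t(1) and x(k)) or (three_step and t(3) and x(k));
      rout(k) <= (i(1) and t(1) and y(k)) or
                 (three_step and ((t(1) and x(k)) or (t(2) and y(k))));
   end generate;
end sketch;
```

The generate loop is one way to avoid writing the R0 expressions four times; writing the
four pairs out explicitly would be equivalent.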
Simple Processor: VHDL Code
• Components:
• upcount
• regn
• trin
• dec2to4
The counter and decT instances instantiate the step counter and decoder circuit.
Simple Processor: Components
N-bit tri-state buffer
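An N-bit tri-state buffer of this kind might look like the sketch below. The port names
x, e, y and the default width are assumptions, since the original listing is not
reproduced here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- N-bit tri-state buffer: drives y with x when e = '1', otherwise releases it.
entity trin is
   generic( N : integer := 8 );   -- assumed default width
   port(
      x : in  std_logic_vector(N-1 downto 0);
      e : in  std_logic;
      y : out std_logic_vector(N-1 downto 0)
   );
end trin;

architecture sketch of trin is
begin
   y <= x when e = '1' else (others => 'Z');
end sketch;
```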
N-bit register
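A possible N-bit register with a load enable is sketched below. The port names r, rin,
clock, q and the default width are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- N-bit register: q takes the value of r on a rising clock edge when rin = '1'.
entity regn is
   generic( N : integer := 8 );   -- assumed default width
   port(
      r     : in  std_logic_vector(N-1 downto 0);
      rin   : in  std_logic;      -- load enable
      clock : in  std_logic;
      q     : out std_logic_vector(N-1 downto 0)
   );
end regn;

architecture sketch of regn is
begin
   process(clock)
   begin
      if rising_edge(clock) then
         if rin = '1' then
            q <= r;
         end if;
      end if;
   end process;
end sketch;
```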
2-bit up counter with synchronous reset
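A 2-bit up counter whose clear input is sampled on the clock edge could be written as in
this sketch. The port names clear, clock, q are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- 2-bit up counter; clear is synchronous (checked only on the rising edge).
entity upcount is
   port(
      clear, clock : in  std_logic;
      q            : out std_logic_vector(1 downto 0)
   );
end upcount;

architecture sketch of upcount is
   signal count : unsigned(1 downto 0) := "00";
begin
   process(clock)
   begin
      if rising_edge(clock) then
         if clear = '1' then
            count <= "00";
         else
            count <= count + 1;   -- wraps from "11" to "00"
         end if;
      end if;
   end process;
   q <= std_logic_vector(count);
end sketch;
```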
COMPONENT: a piece of conventional code (LIBRARY declarations + ENTITY + ARCHITECTURE).
By declaring such code as a COMPONENT, it can then be used within another circuit.
Simple Processor: Package
Frequently used pieces of VHDL code such as COMPONENTS, FUNCTIONS, or PROCEDURES are
placed inside a PACKAGE and compiled into the destination LIBRARY. This allows code
sharing and code reuse. A PACKAGE can also contain TYPE and CONSTANT definitions.
Ways of declaring COMPONENTS: (a) declarations in the main code itself,
(b) declarations in a PACKAGE.
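Option (b) can be illustrated with a small package. This is a sketch under assumptions: the
package name proc_components, the regn port list, and the signal names in the commented
instantiation are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- (b) COMPONENT declared once in a PACKAGE (package name is hypothetical).
package proc_components is
   component regn
      generic( N : integer := 8 );
      port(
         r     : in  std_logic_vector(N-1 downto 0);
         rin   : in  std_logic;
         clock : in  std_logic;
         q     : out std_logic_vector(N-1 downto 0)
      );
   end component;
end proc_components;

-- A design that needs regn then only adds
--    use work.proc_components.all;
-- and instantiates it, for example
--    reg0: regn generic map (N => 8) port map (bus_wires, rin(0), clock, r0);
-- With option (a), the same component declaration would instead sit in the
-- declarative part of the architecture that uses it.
```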
• Reset: initializes the counter to 00
• Func: six-bit signal (F & Rx & Ry), input to the Function Register
• Instantiation of the Function Register with data inputs Func and outputs FuncReg
• Logic expressions for the outputs of the control circuit
• Instantiation of tri-state buffers and registers
• Description of the adder/subtractor module
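An adder/subtractor controlled by AddSub might be described as in the sketch below. The
entity name addsub_unit, its port names, and the default width are assumptions; the
original module description is not reproduced here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Adder/subtractor: addsub = '0' gives a + b, addsub = '1' gives a - b.
entity addsub_unit is
   generic( N : integer := 8 );   -- assumed default width
   port(
      a, b   : in  std_logic_vector(N-1 downto 0);
      addsub : in  std_logic;
      sum    : out std_logic_vector(N-1 downto 0)
   );
end addsub_unit;

architecture sketch of addsub_unit is
begin
   -- Results wrap modulo 2^N, which matches 2's complement arithmetic.
   sum <= std_logic_vector(unsigned(a) + unsigned(b)) when addsub = '0' else
          std_logic_vector(unsigned(a) - unsigned(b));
end sketch;
```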
Simple Processor: VHDL Code: example results
• An operation starts at each rising clock edge (↑) at which w=1
• At 250 ns: Load R0, data
• Next: Load R1, data
• Next: Load R2, data
• At 850 ns: Add R1,R0
  • At the next ↑, R1 (55) appears on the bus
  • At 950 ns, 55 is loaded into A and R0 (2A) is placed on the bus; the adder
    generates the sum 7F
  • At 1050 ns, the sum is loaded into G; after this ↑, G (7F) is placed on the bus
  • At 1150 ns, 7F is loaded into R1
• Move R3,R1
• Sub R3,R2
Embedded System: Computer System
A computer is an electronic machine that performs computations. To have this machine
perform a task, the task must be broken into small instructions; the computer performs
the complete task by executing each of its constituent instructions.
Each computer should have a CPU to execute instructions, a memory to store instructions
and data, and an IO device to transfer information between the computer and the
outside world.
[Figure: Von-Neumann Machine]
Computer System: Software
• The part of a computer system that contains instructions for the machine to perform
is called its software.
• Ways to describe computer software:
  • Machine Language
  • Assembly Language
  • High-Level Language
Computer System: ISA
• Instruction Set Architecture (ISA): specifies the interface between the hardware and
software of a processing unit. The ISA provides the details of the instructions that a
computer should be able to understand and execute.
[Figures: Computer System Components; Instruction Format]
Computer System: Simple CPU Design: Single-Cycle Implementation
CPU Specification:
CPU External Busses: a 16-bit external data bus and a 13-bit address bus.
General Purpose Registers: one 16-bit register, the accumulator (acc).
Memory Organization: capable of addressing 8192 words of memory; each word is 16 bits
wide.
Instruction Format: each instruction is a 16-bit word and occupies one memory word.
Operands: an explicit operand (the memory location whose address is specified in the
instruction) and an implicit operand (acc).
The CPU has a total of 8 instructions, divided into three classes:
• Arithmetic-logical instructions (ADD, SUB, AND, and NOT)
• Data-transfer instructions (LDA, STA)
• Control-flow instructions (JMP, JZ)
Computer System: Simple CPU Design
• Addressing Mode: Direct Addressing
• Instruction Set:
[Table: Instruction Set]
CPU Design: Single-Cycle Implementation: datapath design
• Datapath design is an incremental process: at each increment, consider a class of
instructions and build up a portion of the datapath.
• Then, in steps, combine these partial datapaths to generate the complete datapath.
• In these steps, decide on the control signals that control events in the datapath.
• In the design of the datapath, the concern is how control signals affect the flow of
data and the function of data units in the datapath, not how the control signals are
generated.
Simple CPU Design
Step 1: Program Sequencing
• Instruction Fetch (IF): reading an instruction from the memory.
• Instruction memory: stores the instructions.
• Program Counter (pc): a 13-bit register that holds the address of the current
instruction to be read from the instruction memory. After the completion of the current
instruction, the pc is incremented by one to point to the next instruction in the
instruction memory.
[Figure: Program Sequencing Datapath]
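The program sequencing step can be sketched as a 13-bit register with an incrementer.
The entity name pc_seq and the asynchronous reset to address 0 are assumptions added for
illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Program sequencing only: pc advances by one on every rising clock edge.
entity pc_seq is
   port(
      clk, reset : in  std_logic;
      pc         : out std_logic_vector(12 downto 0)  -- address to instruction memory
   );
end pc_seq;

architecture sketch of pc_seq is
   signal pc_reg : unsigned(12 downto 0);
begin
   process(clk, reset)
   begin
      if reset = '1' then
         pc_reg <= (others => '0');   -- assumed reset address
      elsif rising_edge(clk) then
         pc_reg <= pc_reg + 1;        -- point to the next instruction
      end if;
   end process;
   pc <= std_logic_vector(pc_reg);
end sketch;
```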
Step 2: Arithmetic-Logical Instruction Datapath
• The first operand is acc. The second operand is read from the data memory.
• The adr field of the instruction (bits 12 to 0) points to the memory location that
contains the second operand.
• The result of the operation is stored in acc.
• The arithmetic-logical unit (alu) performs the operation of arithmetic and logical
instructions.
• The alu operation is controlled by a 2-bit input, alu_op.
• The alu is designed as a combinational circuit.
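A combinational alu with a 2-bit alu_op might look like the sketch below. The alu_op
encoding (00, 01, 10, 11 for the four operations) follows the control-signal description
later in these notes. The port names, and modelling the fourth operation as a bitwise
complement of the memory operand, are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity alu is
   port(
      a, b   : in  std_logic_vector(15 downto 0);  -- a: acc, b: memory operand
      alu_op : in  std_logic_vector(1 downto 0);
      result : out std_logic_vector(15 downto 0)
   );
end alu;

architecture sketch of alu is
begin
   -- Purely combinational: result follows the inputs with no clock.
   with alu_op select
      result <= std_logic_vector(unsigned(a) + unsigned(b)) when "00",   -- ADD
                std_logic_vector(unsigned(a) - unsigned(b)) when "01",   -- SUB
                a and b                                     when "10",   -- AND
                not b                                       when others; -- "11" (assumed complement)
end sketch;
```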
Step 3: Combining the Two Previous Datapaths
• Connect the address input of the data memory to the adr field (bits 12 to 0) of
the instruction read from the instruction memory.
• The combined datapath is able to sequence the program and execute arithmetic
or logical instructions.
Step 4: Data-Transfer Instruction Datapath: LDA and STA
• LDA uses the adr field of the instruction to read a 16-bit data word from the data
memory and store it in the acc register.
• STA writes the content of acc into the data memory location pointed to by the adr
field.
• The data memory has two control signals, mem_read and mem_write, to control reading
from it and writing into it.
• To control writing to the accumulator, the acc_write (write-control, or clock enable)
signal is needed.
Step 5: Combining the Two Previous Datapaths
• Combining the two datapaths may result in multiple connections to the input of an
element.
• To keep both connections, a multiplexer (or a bus) is used to select one of the
sources.
Step 6: Control-Flow Instruction Datapath
• JMP: an unconditional jump; writes the adr field (bits 12 to 0) of an instruction
into pc.
• JZ: a conditional jump; writes the adr field into pc if acc is zero.
• To check for the zero value of acc, a NOR gate on the output of this register
generates the signal that detects this condition for execution of the JZ instruction.
[Figure: Control-Flow Instructions Datapath]
Step 7: Combining the Two Previous Datapaths
• A multiplexer selects the appropriate source for the pc input; its select input is
pc_src.
[Figure: Simple CPU Datapath]
Instruction Execution: Example instruction ADD 100
• On the rising edge of the clock, a new value is written into pc, and pc addresses
the instruction memory to read the instruction ADD 100.
• When the memory read operation is complete, the controller starts to decode the
instruction.
• The controller issues the appropriate control signals to control the flow of data
in the datapath.
• On the next rising edge of the clock, the alu output is written into acc to
complete the execution of the current instruction, and pc+1 is written
into pc. This new value of pc points to the next instruction.
• Since the execution of the instruction is completed in one clock cycle, the
implementation is called a single-cycle implementation.
Single-Cycle Implementation: Controller Design
• The controller issues the control signals based on the opcode field of the instruction.
• The opcode field does not change while the instruction is being executed.
• The control signals have fixed values during the execution of an instruction.
• The controller is implemented as a combinational circuit.
• The controller issues all control signals directly, except pc_src, which is generated by
a simple logic circuit (the Jump Logic block).
• For all instructions except JMP and JZ, both jmp_uncond and jmp_cond are 0. Hence the
Jump Logic block produces a 0 on pc_src, which causes pc to increment.
• For the JMP instruction, jmp_uncond becomes 1; this puts a 1 on pc_src and
directs the adr field of the instruction into the pc input.
• For the JZ instruction, jmp_cond is asserted. If acc_zero is 1 (acc_zero becomes 1 when
all bits of acc are 0), the address field of the instruction from the instruction memory
is written into the pc register. Otherwise, if acc is not 0, the pc+1 source of pc is
selected.
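The Jump Logic described in the last three bullets reduces to one OR and one AND gate plus
the zero detector on acc. A minimal sketch, assuming the entity name jump_logic and a
16-bit acc input:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity jump_logic is
   port(
      acc                  : in  std_logic_vector(15 downto 0);
      jmp_uncond, jmp_cond : in  std_logic;
      pc_src               : out std_logic   -- '0': pc+1, '1': adr field
   );
end jump_logic;

architecture sketch of jump_logic is
   signal acc_zero : std_logic;
begin
   -- Behaves like a NOR over all acc bits: '1' only when acc is all zeros.
   acc_zero <= '1' when acc = x"0000" else '0';
   -- JMP always jumps; JZ jumps only when acc is zero.
   pc_src   <= jmp_uncond or (jmp_cond and acc_zero);
end sketch;
```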
Status of the control signals that control the flow of data in the datapath.
Arithmetic-Logical Class:
• mem_read=1, to read an operand from the data memory
• acc_write=1, to store the alu result in acc
• alu_op becomes 00, 01, 10, or 11 for ADD, SUB, AND, and NEG, respectively
• acc_src=0, to direct the alu output to the acc input
• jmp_cond and jmp_uncond are 0, to direct pc+1 to the pc input
Data-Transfer Class:
LDA Instruction:
• mem_read=1, to read an operand from the data memory
• acc_write=1, to store the data memory output in acc
• alu_op is XX, because the alu has no role in the execution of LDA
• acc_src=1, to direct the data memory output to the acc input
• jmp_cond and jmp_uncond are 0, to direct pc+1 to the pc input
STA Instruction:
• mem_write=1, to write acc into the data memory
• acc_write=0, so that the value of acc remains unchanged
• alu_op is XX, because the alu has no role in the execution of STA
• acc_src=X, because acc writing is disabled and its source is not important
• jmp_cond and jmp_uncond are 0, to direct pc+1 to the pc input
Control-Flow Class:
JMP Instruction:
• mem_read and mem_write are 0, because JMP does not read from or write into the data
memory
• acc_write=0, because acc does not change during JMP
• alu_op is XX, because the alu has no role in the execution of JMP
• acc_src=X, because acc writing is disabled and its source is not important
• jmp_cond=0, jmp_uncond=1: puts 1 on pc_src and directs the jump address (bits 12 to 0)
to pc
JZ Instruction:
• mem_read and mem_write are 0, because JZ does not read from or write into the data
memory
• acc_write=0, because acc does not change during JZ
• alu_op is XX, because the alu has no role in the execution of JZ
• acc_src=X, because acc writing is disabled and its source is not important
• jmp_cond=1, jmp_uncond=0: if acc_zero is 1, puts 1 on pc_src and directs the jump
address (bits 12 to 0) to the pc input. Otherwise pc_src is 0 and pc+1 is directed to
the pc input.
Controller Truth Table
The controller (single-cycle implementation) of the system can be designed as a
combinational circuit using a truth table.
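Such a truth table maps directly onto a selected signal assignment. In the sketch below
the control-signal values follow the per-instruction descriptions in these notes, but the
3-bit opcode values, the entity name sc_controller, and driving the don't-care (X) entries
and the unstated mem_read of STA as 0 are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sc_controller is
   port(
      opcode               : in  std_logic_vector(2 downto 0);  -- assumed: instruction bits 15..13
      mem_read, mem_write  : out std_logic;
      acc_write, acc_src   : out std_logic;
      alu_op               : out std_logic_vector(1 downto 0);
      jmp_uncond, jmp_cond : out std_logic
   );
end sc_controller;

architecture sketch of sc_controller is
   -- One truth-table row, packed as:
   -- mem_read & mem_write & acc_write & acc_src & alu_op & jmp_uncond & jmp_cond
   signal row : std_logic_vector(7 downto 0);
begin
   -- Opcode values are ASSUMED; don't-care entries are driven as 0.
   with opcode select
      row <= "10100000" when "000",   -- ADD
             "10100100" when "001",   -- SUB
             "10101000" when "010",   -- AND
             "10101100" when "011",   -- NOT/NEG
             "10110000" when "100",   -- LDA
             "01000000" when "101",   -- STA
             "00000010" when "110",   -- JMP
             "00000001" when others;  -- JZ
   mem_read   <= row(7);
   mem_write  <= row(6);
   acc_write  <= row(5);
   acc_src    <= row(4);
   alu_op     <= row(3 downto 2);
   jmp_uncond <= row(1);
   jmp_cond   <= row(0);
end sketch;
```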
CPU Design: Multi-Cycle Implementation
The single-cycle implementation uses:
• Two memory units
• Two functional units (the alu and the adder)
The multi-cycle implementation reduces the required hardware:
• Hardware can be shared among the execution steps of an instruction
• Each instruction is executed in a series of steps, each of which takes one clock
cycle
CPU Design: Multi-Cycle Implementation: datapath design
• Use a single memory unit, which stores both instructions and data.
• Use a single alu, which plays the role of both the alu and the adder.
Sharing hardware adds one or more registers and multiplexers:
• One multiplexer chooses the address of the memory unit between the pc output (to
address instructions) and bits 12 to 0 of the instruction (to address data).
• Two multiplexers are placed at the alu inputs:
  • the first alu input chooses between pc and acc;
  • the second alu input chooses between the memory output and the constant value 1.
  The alu inputs are 16 bits wide, so 3 zeros are appended on the left of pc to make it
  16 bits.
• Instruction register (ir): stores the instruction read from the memory.
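The three multiplexers and the zero extension of pc can be sketched as follows. The select
names i_or_d, src_a and src_b and the polarities i_or_d=0 (pc), src_a=0 (pc) and src_b=1
(constant 1) are taken from the State 0 description of the controller; the opposite
polarities, the entity name and the remaining port names are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mc_muxes is
   port(
      pc       : in  std_logic_vector(12 downto 0);
      ir_adr   : in  std_logic_vector(12 downto 0);  -- bits 12 to 0 of ir
      acc      : in  std_logic_vector(15 downto 0);
      mem_out  : in  std_logic_vector(15 downto 0);
      i_or_d, src_a, src_b : in std_logic;
      mem_adr  : out std_logic_vector(12 downto 0);
      alu_a, alu_b : out std_logic_vector(15 downto 0)
   );
end mc_muxes;

architecture sketch of mc_muxes is
begin
   -- Memory address: pc for instructions, ir address field for data.
   mem_adr <= pc when i_or_d = '0' else ir_adr;
   -- First alu input: pc padded with 3 zeros on the left, or acc.
   alu_a   <= ("000" & pc) when src_a = '0' else acc;
   -- Second alu input: constant 1 (for pc+1) or the memory output.
   alu_b   <= x"0001" when src_b = '1' else mem_out;
end sketch;
```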
[Figure: Multi-Cycle Implementation: Datapath design]
Multi-Cycle Implementation: Steps for execution of the instructions
Instruction Fetch (IF): read the instruction from memory and increment pc.
• Memory read: pc addresses the memory for a memory-read operation, and the instruction
is written into ir.
• pc increment: apply pc to the first alu input and the constant value 1 to the second
alu input, perform an addition, and store the alu output in pc.
Instruction Decode (ID): the controller decodes the instruction (stored in ir) to issue
the appropriate control signals.
Execution (EX): the datapath operation in this step is determined by the instruction class:
• Arithmetic-Logical Class: apply bits 12 to 0 of ir to the memory and perform a
memory read operation. Apply acc to the first alu input and the memory output
to the second alu input, perform an alu operation (addition, subtraction, logical
and, or negation), and finally store the alu result into acc.
• Data-Transfer Class: apply bits 12 to 0 of ir to the memory.
  • LDA: perform a memory read operation and write the data into acc.
  • STA: perform a memory write operation to write acc into the memory.
• Control-Flow Class:
  • JMP: write bits 12 to 0 of ir to pc.
  • JZ: write bits 12 to 0 of ir to pc if the content of acc is zero.
Controller Design:
• The controller specifies the appropriate control signals for each step.
• The controller of a multi-cycle datapath is designed as a sequential circuit.
[Figure: Multi-Cycle Implementation: Datapath & Controller interconnection]
Multi-Cycle Implementation: Controller design
The controller is designed as a Moore finite state machine:
• Each state issues the appropriate control signals and specifies the next state.
• Transitions between states are triggered by the edge of the clock.
• All control signals of a state are issued on entering the state.
State 0: IF step:
• pc is applied to the memory address input (i_or_d=0),
• the instruction is read from memory (mem_read=1),
• the instruction is written into ir (ir_write=1),
• pc is incremented by 1 (src_a=0, src_b=1, alu_op=00, pc_src=1, pc_write_uncond=1).
State 1: gives the controller enough time to decode the instruction.
• No control signal is asserted.
• Instruction decoding specifies the next state according to the type of the instruction
being executed.
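States 0 and 1 can be sketched as a Moore machine in the usual two-process style. Only
these two states come from the description above; the s_ex state is a placeholder for the
per-instruction execution states, and the entity name, the reset, and the state names are
assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mc_controller is
   port(
      clk, reset                 : in  std_logic;
      i_or_d, mem_read, ir_write : out std_logic;
      src_a, src_b, pc_src       : out std_logic;
      pc_write_uncond            : out std_logic;
      alu_op                     : out std_logic_vector(1 downto 0)
   );
end mc_controller;

architecture sketch of mc_controller is
   -- s_ex is a placeholder for the execution states (not from the source).
   type state_type is (s_if, s_id, s_ex);
   signal state_reg, state_next : state_type;
begin
   -- State register: transitions happen on the clock edge.
   process(clk, reset)
   begin
      if reset = '1' then
         state_reg <= s_if;
      elsif rising_edge(clk) then
         state_reg <= state_next;
      end if;
   end process;

   -- Moore outputs and next state: depend on the current state only.
   process(state_reg)
   begin
      i_or_d <= '0'; mem_read <= '0'; ir_write <= '0';
      src_a <= '0'; src_b <= '0'; pc_src <= '0';
      pc_write_uncond <= '0'; alu_op <= "00";
      state_next <= s_if;
      case state_reg is
         when s_if =>   -- State 0: fetch the instruction and compute pc+1
            mem_read <= '1'; ir_write <= '1';
            src_b <= '1'; pc_src <= '1'; pc_write_uncond <= '1';
            state_next <= s_id;
         when s_id =>   -- State 1: decode, no control signal asserted
            state_next <= s_ex;
         when s_ex =>   -- placeholder execution state
            state_next <= s_if;
      end case;
   end process;
end sketch;
```

In a full controller, the s_id branch would select among several execution states based on
the opcode held in ir.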
Next State:
Arithmetic-logical instruction / LDA instruction / STA instruction / JMP instruction /
JZ instruction: