Input Output
Structure
4.0 Introduction
4.1 Objectives
4.2 Input/Output Module
4.2.1 Functions of I/O Module
4.2.2 Structure of I/O Module
4.3 Input/Output Techniques
4.3.1 Programmed Input/Output
4.3.2 Interrupt Driven Input/Output
4.4 Direct Memory Access
4.5 Input/Output Processors
4.6 External Interface
4.7 Summary
4.8 Model Answers
4.0 INTRODUCTION
So far we have discussed the various components and the memory system of a computer. We have
also discussed one interconnection structure, the bus. In this unit we will briefly discuss
input/output devices, then move on to the functions and structure of an Input/Output module and
to the Input/Output techniques. Towards the end we will discuss Input/Output processors, which
were quite common in mainframe computers. Finally, we will discuss two popular device interfaces.
This is the last unit of the block; the Central Processing Unit will be taken up in the next
block, i.e., Block 2 of the course.
4.1 OBJECTIVES
describe the three types of Input/Output techniques, viz., Programmed Input/Output, Interrupt
driven Input/Output and Direct Memory Access; and
The input/output module (I/O module) is normally connected to the system bus on one end and to
one or more Input/Output devices on the other. Normally, an I/O module can control one or more
peripheral devices.
Is the I/O module merely a connector of Input/Output (I/O) devices to the system bus?
No, it performs additional functions such as communicating with the CPU as well as with the
peripherals. But why use an I/O module at all? Why not connect a peripheral directly to the
system bus? There are three main reasons for this:
a) The diversity and variety of I/O devices make it difficult to incorporate all the peripheral
device logic (i.e. its control commands, data formats etc.) into the CPU. Doing so would also
reduce the flexibility to use any new development.
b) I/O devices are normally slower than memory and the CPU; it is therefore not advisable to let
them communicate directly on the high-speed bus.
c) The data format and word length used by a peripheral may be quite different from those of the
CPU.
The need for I/O from the various I/O devices by the CPU is quite unpredictable. In fact, it
depends on the I/O needs of particular programs and normally does not follow any pattern. Since
the I/O module also shares the system bus and memory for data Input/Output, control and timing
are needed to coordinate the flow of data between external devices and the CPU or memory. A
typical control cycle for the transfer of data from an I/O device to the CPU is:
The CPU enquires about the status of an attached device from the I/O module. The status can be
busy, ready or out of order.
The I/O module responds with the status of the device.
If the device is ready, the CPU commands the I/O module to transfer data from the I/O device.
The I/O module accepts data from the I/O device.
The data is then transferred from the I/O module to the CPU.
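The control cycle above can be sketched in a few lines of Python. This is only an illustrative model: the class `IOModule`, the `Status` values and the function `cpu_read` are hypothetical names introduced here, and the "device" is simply a list of values.

```python
from enum import Enum


class Status(Enum):
    BUSY = "busy"
    READY = "ready"
    OUT_OF_ORDER = "out of order"


class IOModule:
    """Hypothetical I/O module that fronts a single input device."""

    def __init__(self, device_data, status=Status.READY):
        self.device_data = list(device_data)  # words waiting in the device
        self.status = status
        self.data_register = None

    def report_status(self):
        # The module answers the CPU's status enquiry.
        return self.status

    def accept_from_device(self):
        # The module accepts one word from the device into its data register.
        self.data_register = self.device_data.pop(0)

    def send_to_cpu(self):
        # The word is handed over to the CPU and the register is cleared.
        word, self.data_register = self.data_register, None
        return word


def cpu_read(module):
    """One control cycle: enquire status, command the transfer, receive data."""
    if module.report_status() is not Status.READY:
        return None  # device busy or out of order: no transfer this time
    module.accept_from_device()
    return module.send_to_cpu()
```

For example, `cpu_read(IOModule([65, 66]))` returns the first word, while a module whose device is busy yields `None` and the CPU must try again later.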
The example given above clearly shows the need for communication between the CPU and the I/O
module. This communication can be: commands such as READ SECTOR, WRITE SECTOR or SEEK track
number (which are issued by the CPU to the I/O module); data (which may be required by the CPU
or transferred out); or status information such as BUSY, READY or any error condition from the
I/O module. Status recognition is necessary because of the speed gap between the CPU and the
I/O device. An I/O device might still be busy with the I/O of a previous instruction when it is
needed again.
Another important communication from the CPU is the unique address of the peripheral from which
I/O is expected or which is to be controlled.
c) It should communicate with the I/O device.
Communication between the I/O module and the I/O device is needed to complete the I/O operation.
This communication involves commands, status or data.
Data buffering is quite useful for smoothing out the gap in speed between the CPU and I/O
devices. The data buffers are registers which hold the I/O information temporarily. The I/O is
performed in short bursts in which data is stored in the buffer area, while the device can take
its own time to accept it. In an I/O device to CPU transfer, data is first transferred to the
buffer and then passed on to the CPU from these buffer registers. Thus, the I/O operation does
not tie up the bus for the slower I/O devices.
e) An error detection mechanism should be in-built: The error detection mechanism may involve
checking for mechanical as well as data communication errors. These errors should be reported
to the CPU. Examples of the kind of mechanical errors that can occur in devices are a paper jam
in a printer, mechanical failure, electrical failure etc. Data communication errors may be
checked by using a parity bit.
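As a small illustration of the parity check mentioned above, the sketch below appends an even parity bit to a byte and verifies it on receipt. The function names are hypothetical, and even parity is an assumption; the text does not say which parity convention is used.

```python
def even_parity_bit(byte):
    """Return the bit that makes the total number of 1s even."""
    return bin(byte & 0xFF).count("1") % 2


def add_parity(byte):
    """Build a 9-bit frame: the 8 data bits followed by the parity bit."""
    return ((byte & 0xFF) << 1) | even_parity_bit(byte)


def parity_ok(frame):
    """A frame is accepted only if it contains an even number of 1s."""
    return bin(frame).count("1") % 2 == 0
```

A single flipped bit changes the count of 1s from even to odd, so `parity_ok` rejects the frame; two flipped bits would go unnoticed, which is the known limit of a single parity bit.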
There is a need for I/O logic, which should interpret and execute the dialogue between the CPU
and the I/O module. Therefore, there need to be control lines between the CPU and the I/O
module. In addition, address lines are needed to recognise the address of the I/O module and its
specific device.
Data lines connecting the I/O module to the system bus must exist. These lines serve the purpose
of data transfer.
Data registers may act as a buffer between the CPU and the I/O module.
The interface of the I/O module with the device should have interface logic to control the
device, to get the status information and to transfer data.
Figure 1 shows the diagram of a typical I/O module which, in addition to all the above, has
status/control registers which might be used to pass on status information or to store control
information.
[Figure 1: Diagram of a typical I/O module, showing the interface with the system bus (data
lines) and the interface with the device (external device interface logic with data, status and
control lines)]
4.3 INPUT/OUTPUT TECHNIQUES
Input/Output operations can be performed by three basic techniques. These are:
Programmed Input/Output
Interrupt driven Input/Output
Direct Memory Access
Another technique suggested to reduce the waiting by the CPU is interrupt driven I/O. The CPU
issues the I/O command to the I/O module and starts doing other work, which may be the execution
of a separate program. When the I/O operation is complete, the I/O module interrupts the CPU to
inform it that the I/O has finished. The CPU may then proceed with the execution of this program.
In both programmed I/O and interrupt driven I/O the CPU is responsible for extracting data from
the memory for output and storing data in the memory for input. Such a requirement does not
exist in DMA, where the memory can be accessed directly by the I/O module. Thus, the I/O module
can store data in or extract data from the memory. We will discuss programmed I/O and interrupt
driven I/O in this section and DMA in the next section.
In case the I/O is looked after solely by a separate dedicated processor, that processor is
referred to as an I/O channel or I/O processor. The basic advantage of these devices is that
they free the CPU of the burden of Input/Output. During this time the CPU can do other work,
which effectively increases CPU utilisation. We will discuss I/O channels and I/O processors in
section 4.5.
In addition, in the programmed I/O method the CPU is responsible for constantly checking the
status of the I/O device to see whether it has become free (in case output is desired) or
whether it has finished inputting the current series of data (in case input is going on). Thus,
programmed I/O is a very time consuming method in which the CPU wastes a lot of time checking
and verifying the status of an I/O device. Let us now focus on how this Input/Output is
performed. Figure 3(a) gives the block diagram of transferring a block of data word by word
using the programmed I/O technique.
[Figure 3: Flowcharts of the three I/O techniques: (a) programmed I/O, (b) interrupt driven I/O,
(c) DMA. Each begins with the CPU issuing a read or write command to the I/O module.]
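The cost of programmed I/O can be made concrete with a small simulation. The sketch below is a toy model, not a description of any real device: `SlowDevice`, its `delay` parameter and `programmed_read` are hypothetical names, and "time" is counted in status polls.

```python
class SlowDevice:
    """Hypothetical input device that needs `delay` polls before each word is ready."""

    def __init__(self, words, delay=3):
        self.words = list(words)
        self.delay = delay
        self._countdown = delay

    def is_ready(self):
        # Each status check made while the device is still working is wasted CPU time.
        if self._countdown > 0:
            self._countdown -= 1
            return False
        return True

    def read_word(self):
        self._countdown = self.delay  # the device starts preparing the next word
        return self.words.pop(0)


def programmed_read(device, count):
    """Transfer `count` words, busy-waiting on the device status before each one."""
    memory, wasted_polls = [], 0
    for _ in range(count):
        while not device.is_ready():
            wasted_polls += 1  # the CPU does nothing useful here
        memory.append(device.read_word())
    return memory, wasted_polls
```

With three words and a delay of three polls per word, the CPU spends nine status checks waiting, which is the overhead that interrupt driven I/O is meant to remove.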
I/O Instructions: To carry out input/output, the CPU issues I/O related instructions. These
instructions consist of two components:
the address of the Input/Output device, specifying the I/O device and I/O module; and
an Input/Output command.
There are four types of I/O commands: CONTROL, TEST, READ and WRITE.
CONTROL commands are device specific and are used to give specific instructions to the device,
e.g., a magnetic tape may require rewinding or moving forward by a block. The TEST command
checks the status, such as whether a device is ready or not or is in an error condition. The
READ command is used for the input of data from an input device and the WRITE command is used
for the output of data to an output device.
The other part of an I/O instruction is the address of the I/O device. In systems with
programmed I/O, the I/O module, the main memory and the CPU normally share the system bus. Thus,
each I/O module should interpret the address lines to determine whether the command is meant for
it. In other words: how does the CPU specify which device to access? There are two methods of
doing so. These are called memory mapped I/O and I/O-mapped I/O.
If we use a single address space for memory locations and I/O devices, i.e., the CPU treats the
status and data registers of the I/O module as memory locations, then memory and I/O devices can
be accessed using the same instructions. This is referred to as memory mapped I/O. For memory
mapped I/O only a single READ and a single WRITE line are needed for memory or I/O module read
or write operations. These lines are activated by the CPU for either memory access or I/O device
access. Figure 4 shows the memory mapped I/O system structure. This scheme is used in the
Motorola 68000.
[Figure 4: Structure of memory mapped I/O: the I/O devices share the data bus, the address bus
and a single READ line and WRITE line]
In I/O-mapped I/O, the I/O devices and memory are addressed separately (refer to Figure 5).
There are separate control lines for memory and I/O device read or write operations; thus, a
memory reference instruction does not affect an I/O device. Here separate Input/Output
instructions are needed, which cause data transfer between the addressed I/O module and the CPU.
This structure is used in the Intel 8085 and 8086 series.
[Figure 5: Structure of I/O-mapped I/O: the I/O devices are connected to the data bus and the
address bus]
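The difference between the two addressing schemes can be illustrated with a toy model. Everything here is hypothetical: the class names, the 256-word memory and the device addresses 0xF0 and 0xF1 are chosen only for the example.

```python
class MemoryMappedSystem:
    """One address space: some addresses select device registers instead of memory."""

    def __init__(self, size=256, device_addresses=(0xF0, 0xF1)):
        self.memory = [0] * size
        self.device_registers = {addr: 0 for addr in device_addresses}

    def write(self, address, value):
        # The same WRITE operation reaches memory or a device, depending on the address.
        if address in self.device_registers:
            self.device_registers[address] = value
        else:
            self.memory[address] = value


class IOMappedSystem:
    """Two address spaces: the instruction used decides which one is accessed."""

    def __init__(self, size=256, ports=(0xF0, 0xF1)):
        self.memory = [0] * size
        self.ports = {port: 0 for port in ports}

    def mem_write(self, address, value):
        # A memory reference instruction never touches an I/O device.
        self.memory[address] = value

    def io_out(self, port, value):
        # A separate output instruction addresses the I/O space.
        self.ports[port] = value
```

In the memory mapped model a write to address 0xF0 lands in the device register, whereas in the I/O-mapped model the same address refers to ordinary memory unless the separate `io_out` operation is used.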
4.3.2 Interrupt Driven Input/Output
What is the basic drawback of programmed I/O? I/O devices are much slower than the CPU, and
because the CPU has to repeatedly check whether a device is free, or wait until the completion
of I/O, the performance of the CPU in programmed I/O goes down tremendously. What is the
solution? What about the CPU going back to do other useful work without waiting for the I/O
device to complete or become free? But how will the CPU be informed that the I/O is complete or
that a device is ready for I/O? A well-designed mechanism was conceived for this, which is
referred to as interrupt driven I/O. In this mechanism, provision has been made for interrupting
the work of the CPU once an I/O device has finished the I/O or when it is ready for the I/O.
The interrupt driven I/O mechanism for transferring a block of data is shown in Figure 3(b).
Please note that after issuing a READ command (for input) the CPU goes off to do other useful
work (it may be the execution of a different program) while the I/O module proceeds with reading
data from the associated device. At the completion of an instruction cycle (already discussed in
Unit 1 of this block) the CPU checks for interrupts (which will occur when data is in the data
register of the I/O module and it now needs the CPU's attention).
The CPU now saves the important registers and the processor status of the executing program on a
stack and requests the I/O device to provide its data, which is placed on the data bus by the
I/O device. After taking the required action with the data, the CPU can go back to the program
it was executing before the interrupt.
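The sequence just described can be modelled in a few lines. This is a simplified sketch under stated assumptions: the "program" is a list of instruction names, the saved context is reduced to the program counter, and `io_done_after` is a hypothetical parameter giving the number of instruction cycles the device needs.

```python
def run_with_interrupt(program, io_done_after, device_word):
    """Run `program` while an I/O read is in flight, servicing its interrupt."""
    log, stack = [], []
    pc = 0
    while pc < len(program):
        log.append(f"execute {program[pc]}")
        pc += 1
        # Interrupts are checked only at the end of an instruction cycle.
        if pc == io_done_after:
            stack.append(pc)  # save the context of the running program
            log.append(f"interrupt: read {device_word}")
            pc = stack.pop()  # restore the context and resume
    return log
```

Running `run_with_interrupt(["A", "B", "C"], 2, 99)` shows instructions A and B executing while the device works, the interrupt being serviced between B and C, and the program then continuing from where it left off.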
Interrupt: As discussed in Unit 1, the term interrupt is loosely used for any exceptional event
that causes a temporary transfer of control of the CPU from one program to another, which is
causing the interrupt. Interrupts are primarily issued on:
There are several solutions to these problems. The simplest of them is to provide multiple
interrupt lines, which results in immediate recognition of the interrupting device. Priorities
can be assigned to the various interrupts, and the interrupt with the highest priority is
selected for service in case multiple interrupts occur. But providing multiple interrupt lines
is an impractical approach because only a few lines of the system bus can be devoted to
interrupts. Other methods for this are software poll, daisy chaining etc.
Software Poll: In this scheme, on the occurrence of an interrupt the CPU starts executing a
software routine, termed the interrupt service program or routine, which polls each I/O module
to determine which I/O module has caused the interrupt. This may be achieved by reading the
status registers of the I/O modules. Priority can be implemented easily here by defining the
polling sequence, since the device polled first will have the higher priority. Please note that
after identifying the device, the next set of instructions to be executed will be the device
service routine of that device, resulting in the desired input or output.
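A software poll reduces to a loop over status registers in a fixed order. In this sketch the device names and the dictionary of status flags are hypothetical; a real routine would read hardware status registers instead.

```python
def software_poll(status_registers, polling_order):
    """Return the first device in `polling_order` whose status flag is set.

    The position of a device in `polling_order` is its priority.
    """
    for device in polling_order:
        if status_registers.get(device, False):
            return device
    return None  # no device is requesting service
```

If both the disk and the printer have raised an interrupt and the polling order is keyboard, disk, printer, the routine selects the disk, because it is polled before the printer.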
As far as daisy chaining is concerned, we have one Interrupt Acknowledge line, which is chained
through the various interrupting devices. (The mechanism is similar to the one we discussed in
Unit 2.) There is just one Interrupt Request line. On receiving an Interrupt Request, the
Interrupt Acknowledge line is activated, which in turn passes this signal from device to device.
The first device which has made an interrupt request grasps the signal and responds by putting a
word, which is normally the address of an interrupt servicing program or a unique identifier, on
the data lines. This word is also referred to as the interrupt vector. This address or
identifier is in turn used for selecting the appropriate interrupt servicing program. Daisy
chaining has an in-built priority scheme, which is determined by the sequence of devices on the
Interrupt Acknowledge line.
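The passing of the acknowledge signal along the chain can be sketched as follows. The representation of the chain as a list of `(requesting, vector)` pairs and the vector values are assumptions made for the example.

```python
def daisy_chain_acknowledge(chain):
    """Pass the acknowledge signal along `chain` and return the interrupt vector.

    `chain` lists the devices in their physical order on the acknowledge line,
    each as a pair (requesting, vector).
    """
    for requesting, vector in chain:
        if requesting:
            # The first requesting device absorbs the signal and supplies its vector;
            # devices further down the chain never see the acknowledge.
            return vector
        # A device that did not request simply passes the signal on.
    return None
```

With the chain `[(False, 0x10), (True, 0x20), (True, 0x30)]` the vector 0x20 is returned: the third device is also requesting, but it sits later on the line and must wait.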
In the bus arbitration technique, the I/O module first needs to gain control of the bus and only
after that can it request an interrupt. In this scheme, since only one module can control the
bus, only one request can be made at a time. The interrupt request is acknowledged by the CPU,
in response to which the I/O module places the interrupt vector on the data lines. An interrupt
vector normally contains the address of the interrupt servicing program.
You can refer to further readings for more details on some typical interrupt structures of interrupt
controllers.
When a large amount of data is to be transferred, a DMA module can be used. But why? In both
interrupt driven and programmed I/O the CPU is tied up in executing input/output instructions,
while DMA acts as if it has taken over control from the CPU. The DMA operates in the following
way:
When an I/O is requested, the CPU instructs the DMA module about the operation by providing the
following information:
which operation (Read or Write) is to be performed.
the address of the I/O device which is to be used.
the starting location in the memory where the information will be read from or written to.
the number of words to be written or to be read.
The DMA module transfers the requested block byte by byte directly to the memory without the
intervention of the CPU.
On completion of the request the DMA module sends an interrupt signal to the CPU.
Thus, in DMA the CPU's involvement can be restricted to the beginning and the end of the
transfer. But what does the CPU do while DMA is doing Input/Output? It may execute another
program or another part of the same program. Figure 6 shows the registers of a general DMA
module. Please note that it contains additional registers for counting the data bytes, and also
note that the address register and the data count register are fed from the data lines.
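[Figure 6: Registers of a general DMA module: the address register and data count register,
connected to the address lines and data lines, and the control logic connected to the control
lines]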
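The role of the address and data count registers can be seen in a small model. This is a sketch under simplifying assumptions: only an input (device to memory) transfer is modelled, the device is a Python iterator, and the class and method names are hypothetical.

```python
class DMAModule:
    """Toy DMA module with an address register and a data count register."""

    def __init__(self, memory):
        self.memory = memory
        self.address_register = 0
        self.data_count = 0
        self.interrupt_raised = False

    def program(self, start_address, count):
        # The CPU is involved here, at the beginning of the transfer.
        self.address_register = start_address
        self.data_count = count
        self.interrupt_raised = False

    def run(self, device):
        # The block moves into memory without the CPU touching the data.
        while self.data_count > 0:
            self.memory[self.address_register] = next(device)
            self.address_register += 1
            self.data_count -= 1
        # The CPU is involved again only now, at the end of the transfer.
        self.interrupt_raised = True
```

After `program(2, 3)` and `run(iter([7, 8, 9]))` on an eight-word memory, locations 2 to 4 hold the three words, the data count has reached zero and the interrupt flag is set.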
Let us now see how this DMA mechanism works. Since the DMA module shares the system bus, it
needs to have some way of taking control of the bus so that data can be transferred between the
memory and the DMA module. A DMA module transfers an entire block of data, or one word at a
time, directly to or from memory. But when should the DMA take control of the bus?
For this we will recall how an instruction is executed by the CPU. Figure 7 shows the five
cycles of an instruction execution. The figure also shows the five points where a DMA request
can be responded to and the point where an interrupt request can be responded to. Please note
that an interrupt request is acknowledged at only one point of an instruction cycle.
Alternatively, instead of transferring a complete block, a few words or a single word is
transferred through the system bus to the memory by the DMA. In this mode the DMA forces the CPU
to suspend its operation temporarily. After this transfer, control is returned to the CPU. This
technique is called cycle stealing. In this scheme the rate of I/O by DMA goes down, but on the
other hand the interference caused by the DMA to CPU operation is reduced.
It is possible to minimise this interference with the CPU by the DMA controller, such that
cycles are stolen only when the CPU is not using the system bus. This is termed transparent DMA.
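Transparent DMA can be pictured as a bus schedule in which the DMA fills only the cycles the CPU leaves free. The sketch below is a toy model; the function name and the representation of bus activity as a list of booleans are assumptions made for illustration.

```python
def transparent_dma(cpu_bus_usage, words_to_transfer):
    """Schedule DMA transfers into the bus cycles that the CPU does not use.

    `cpu_bus_usage[i]` is True when the CPU needs the bus in cycle i.
    Returns the owner of each cycle and the number of words still pending.
    """
    schedule = []
    remaining = words_to_transfer
    for cpu_uses_bus in cpu_bus_usage:
        if cpu_uses_bus:
            schedule.append("CPU")  # the CPU is never delayed
        elif remaining > 0:
            schedule.append("DMA")  # one word moves in a free cycle
            remaining -= 1
        else:
            schedule.append("idle")
    return schedule, remaining
```

Because the DMA waits for free cycles, the CPU is never suspended, but the transfer may take longer or remain unfinished if the CPU keeps the bus busy.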
Finally, let us discuss how DMA can be configured. There are many ways; we will discuss some of
them.
The simplest possibility is to allow the DMA, the I/O modules and all the other modules to share
the system bus. This structure is shown in Figure 8(a). In this kind of configuration the DMA
may act as a supportive processor and can use programmed I/O for exchanging data between the
memory and an I/O module through the DMA module. But this spoils the basic advantage of DMA,
since extra cycles are used for transferring information between the memory and the DMA module
and between the DMA module and the I/O module.
The configuration in Figure 8(b) offers clear-cut advantages over the one shown in Figure 8(a).
In these systems a path is provided between the I/O module and the DMA module which does not
include the system bus. The DMA logic may become part of an I/O module and can control one or
more I/O modules. In an extended concept, an I/O bus can be connected to this DMA module. Such a
configuration (shown in Figure 8(c)) is quite flexible and can be extended very easily. In both
these configurations the added advantage is that the data between the I/O module and the DMA
module is transferred off the system bus, thus eliminating the disadvantage we have seen in the
first configuration.
[Diagram: configurations (a), (b) and (c), with I/O modules attached to the DMA module]
Figure 8: Some of the DMA Configurations
4.5 INPUT/OUTPUT PROCESSORS
Before discussing I/O processors, let us briefly recapitulate the developments in the area of
Input/Output functions. These can be summarised as:
Step 1: Direct control of the I/O device by the CPU. Limited number of I/O devices.
Step 2: Introduction of the I/O controller or I/O module. Use of programmed I/O. The CPU was
separated from the details of external I/O interfaces.
Step 3: Continued use of I/O controllers, but with interrupts. The CPU need not wait for an I/O
operation to complete (increased efficiency).
Step 4: Direct access of the I/O module to the memory via DMA. CPU involvement reduced to the
beginning and the end of the DMA operation.
The concept of the I/O processor is an extension of the concept of DMA. The I/O processor can
execute a specialised I/O program residing in the memory without the intervention of the CPU.
Thus, the CPU only needs to specify a sequence of I/O activities to the I/O processor. The I/O
processor then executes the necessary I/O instructions which are required for the task, and
interrupts the CPU only after the entire sequence of I/O activities specified by the CPU has
been completed. An advanced I/O processor can have its own memory, enabling a large set of I/O
devices to be controlled without much involvement from the CPU. Thus, an I/O processor has the
additional ability to execute I/O instructions, which provides complete control over I/O
operations. An I/O processor is therefore much more powerful than DMA, which provides only
limited control of an I/O device. For example, if an I/O device is busy, DMA will only interrupt
the CPU and will inform the CPU again when the device is free, whereas the I/O processor will
itself keep checking the status of the I/O device, go ahead with the I/O once the device is
found to be free, and communicate with the CPU when the I/O finishes. The communication between
the I/O processor and the CPU can be achieved by writing messages in a memory area shared by the
two processors. The CPU instructs the I/O processor to execute an I/O program in the memory. The
program then specifies the device or devices and the area of the memory where the I/O data is
stored or is to be stored. In addition, this program also contains the actions that are to be
taken in case of errors, and the priority that is to be given to the various I/O devices.
In a computer system which has IOPs, the CPU normally does not execute I/O data transfer
instructions. I/O instructions are stored in memory and are executed by the IOPs. The IOP can be
provided with direct access to the memory and can control the system bus. An IOP can execute a
sequence of data transfer instructions involving different memory regions and different devices
without the intervention of the CPU. The I/O processor is termed a channel in IBM machines.
Later on, several other computers also used the term channel. The earlier channels did not have
any memory, but present channels may have a large cache memory, which may be used for data
buffering. The Control Data Corporation (CDC) computers and some other computers use relatively
sophisticated I/O systems. These are called Peripheral Processing Units (PPUs). These PPUs also
perform the job of an I/O processor. PPUs are in themselves complete, simple computers with
their own memory. In addition to I/O they are capable of performing some additional data
manipulations, which include data formatting, character translation and buffering.
Let us now discuss two common types of I/O channels. For high-speed devices a selector channel
is used. This channel can transfer data from one high-speed device at a time. I/O modules can in
turn handle each of these high-speed devices. Thus, we effectively have an I/O processor taking
the place of the CPU in controlling various I/O modules.
[Figure: I/O channel structure: control signals from/to the CPU reach the channel, which
connects through I/O module controllers to the high-speed I/O devices]
The second type of channel is the multiplexer channel, which can handle input/output with a
number of devices at the same time. If the devices are slow then byte multiplexing is used. Let
us explain this with an example. If we have three slow devices, each of which needs to send its
own sequence of individual bytes, then on a byte multiplexer channel the bytes may be sent
interleaved as B1 Y1 T1 B2 Y2 T2 B3 Y3 T3 ... For high-speed devices, blocks of data from
several devices are interleaved. These are called block multiplexer channels.
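Byte multiplexing is a round-robin interleave of the device streams. The following sketch uses a hypothetical function name and represents each device's output as a list of byte labels.

```python
from itertools import zip_longest


def byte_multiplex(*device_streams):
    """Interleave the streams one byte at a time, in round-robin order."""
    channel = []
    for group in zip_longest(*device_streams):
        # A device that has run out of data contributes nothing to this round.
        channel.extend(byte for byte in group if byte is not None)
    return channel
```

Three devices sending B1 B2 B3, Y1 Y2 Y3 and T1 T2 T3 respectively produce the channel sequence B1 Y1 T1 B2 Y2 T2 B3 Y3 T3, matching the example in the text. A block multiplexer would interleave whole blocks in the same manner.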
We are not including an example of an Input/Output processor here, but you are advised to look
into such examples in the further readings after you have completed the first two blocks.
Our discussion of the I/O system will not be complete if we do not discuss external interfaces.
The external interface is the interface between the I/O module and the peripheral devices. This
interface can be characterised into two main categories: (a) parallel interface, and (b) serial
interface.
In a parallel interface multiple bits can be transferred simultaneously. The parallel interface
is normally used for high-speed peripherals such as tapes and disks. The dialogues that take
place across the interface include the exchange of control information and data. A common
parallel interface is Centronics.
In a serial interface only one line is used to transmit data; therefore, only one bit is
transferred at a time. A serial interface is used for serial printers and terminals. The most
common serial interface is RS-232C. A newer standard, which is becoming very popular and allows
multiple devices to be connected, is the Universal Serial Bus (USB). It is an industry standard
today.
Irrespective of the type of interface, the I/O module has to communicate with the peripheral in
the following way for a read or write operation.
A control signal is sent by the I/O module to the peripheral requesting permission to send (for
a write operation) or receive (for a read operation) data.
The data is transferred from the I/O module to the peripheral (for write) or from the peripheral
to the I/O module (for read).
Both serial and parallel transmission can be in two modes synchronous and asynchronous. In
synchronous mode several characters are transmitted in a single transmission while in asynchronous
mode only few bits are transmitted at a time.
1. The I/O mapped I/O scheme requires no additional line from the CPU to the I/O device except
for the system bus. (True / False)
2. The memory mapped I/O scheme uses a separate instruction for data transfer from/to memory and
from/to the I/O module. (True / False)
3. The advantage of interrupt driven I/O over programmed I/O is that in interrupt driven I/O the
interrupt mechanism frees I/O devices quickly. (True / False)
4. In transparent DMA the cycles are stolen only when the CPU is not using the bus.
(True / False)
5. Most I/O processors have their own memory, while a DMA module does not have its own memory
except for registers or a simple buffer area. (True / False)
6. Parallel interfaces are commonly used for connecting printers to a computer. (True / False)
4.7 SUMMARY
This unit is the last unit of this block. In this block we have covered three major points,
which include I/O devices, memory and the interconnection structure. The last of the components,
i.e., the CPU, will be discussed in Block 2. This unit was devoted mainly to the I/O of a
computer system; we have discussed the I/O module, I/O techniques, external I/O interfaces etc.
The design details of these have not been covered in this unit. For details on the design
aspects you can refer to the further readings.
1. (a) False
(b) False
(c) True
2. A device controller is an I/O module, which interacts with the I/O devices as per the
instructions provided to it by the CPU.
1. False
2. False
3. False
4. True
5. True
6. True