Computer Organization
CREDITS 04
Course objectives:
This course will enable students to
Understand the basics of computer organization: structure and operation of computers and their
peripherals.
Understand the concept of programs as sequences of machine instructions.
Expose different ways of communicating with I/O devices and standard I/O interfaces.
Describe hierarchical memory systems, including cache memories and virtual memory.
Describe arithmetic and logical operations with integer and floating-point operands.
Understand the basic processing unit, the organization of a simple processor, the concept of
pipelining, and other large computing systems.
Module-1 (10 Teaching Hours)
Memory System: Basic Concepts, Semiconductor RAM Memories, Read Only Memories, Speed, Size,
and Cost, Cache Memories, Mapping Functions, Replacement Algorithms, Performance Considerations,
Virtual Memories, Secondary Storage.
Textbook 1: Ch 5: 5.1 to 5.4, 5.5.1, 5.5.2, 5.6, 5.7, 5.9
Module-4 (10 Teaching Hours)
Arithmetic: Numbers, Arithmetic Operations and Characters, Addition and Subtraction of Signed
Numbers, Design of Fast Adders, Multiplication of Positive Numbers, Signed Operand Multiplication,
Fast Multiplication, Integer Division, Floating-point Numbers and Operations.
Textbook 1: Ch 2: 2.1, Ch 6: 6.1 to 6.7
Module-5
Text Books:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky: Computer Organization, 5th Edition, Tata McGraw Hill,
2002.
Reference Books:
1. William Stallings: Computer Organization & Architecture, 9th Edition, Pearson, 2015.
Computer Organization 15CS34
There are two ways that byte addresses can be assigned across words, as shown in fig b.
The name big-endian is used when lower byte addresses are used for the more significant bytes
(the leftmost bytes) of the word. The name little-endian is used for the opposite ordering, where
the lower byte addresses are used for the less significant bytes (the rightmost bytes) of the word.
In addition to specifying the address ordering of bytes within a word, it is also necessary
to specify the labeling of bits within a byte or a word. The same ordering is also used for labeling
bits within a byte, that is, b7, b6, …, b0, from left to right.
WORD ALIGNMENT:-
In the case of a 32-bit word length, natural word boundaries occur at addresses 0, 4, 8, …,
as shown in the above fig. We say that the word locations have aligned addresses. In general, words
are said to be aligned in memory if they begin at a byte address that is a multiple of the number of
bytes in a word. The number of bytes in a word is a power of 2. Hence, if the word length is 16
(2 bytes), aligned words begin at byte addresses 0, 2, 4, …, and for a word length of 64 (2^3, i.e.
8 bytes), aligned words begin at byte addresses 0, 8, 16, ….
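The alignment rule above can be sketched in a few lines of Python (an illustrative check, not from the textbook):

```python
def is_aligned(byte_address: int, word_bytes: int) -> bool:
    """A word is aligned if its starting byte address is a multiple
    of the word size in bytes (which is a power of 2)."""
    return byte_address % word_bytes == 0

# 32-bit words (4 bytes): aligned at 0, 4, 8, ...
assert is_aligned(8, 4) and not is_aligned(6, 4)
# 64-bit words (8 bytes): aligned at 0, 8, 16, ...
assert is_aligned(16, 8) and not is_aligned(12, 8)
```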
2. Explain the different functional units of a computer with a neat block diagram.
Dec2015,Jan 2014
A computer consists of five functionally independent main parts: input, memory, arithmetic
logic unit (ALU), output, and control units.
[Figure: block diagram of a computer showing the Input unit, Memory, ALU, Output unit, and
Control Unit, with the I/O devices connected through an I/O processor]
The input device accepts coded information, i.e. a source program in a high-level language.
This is either stored in the memory or immediately used by the processor to perform the desired
operations. The program stored in the memory determines the processing steps. Basically, the
computer converts a source program into an object program, i.e. into machine language. Finally,
the results are sent to the outside world through an output device. All of these actions are
coordinated by the control unit.
Input unit: -
Memory unit: -
Its function is to store programs and data. It is basically of two types:
Primary memory
Secondary memory
1. Primary memory: - This is the memory exclusively associated with the processor, and it operates
at electronic speeds. Programs must be stored in this memory while they are being executed. The
memory contains a large number of semiconductor storage cells, each capable of storing one bit
of information. These are processed in groups of fixed size called words. To provide easy
access to a word in memory, a distinct address is associated with each word location. Addresses
are numbers that identify memory locations. The number of bits in each word is called the word
length of the computer. Programs must reside in the memory during execution. Instructions and
data can be written into the memory or read out under the control of the processor. Memory in
which any location can be reached in a short and fixed amount of time after specifying its address
is called random-access memory (RAM). The time required to access one word is called the memory
access time. Memory which is only readable by the user, and whose contents cannot be altered, is
called read-only memory (ROM); it contains the operating system.
2. Secondary memory: - This is used where large amounts of data and programs have to be stored,
particularly information that is accessed infrequently. Examples: magnetic disks and tapes,
optical disks (i.e. CD-ROMs), floppies, etc.
Arithmetic logic unit (ALU):-
Most of the computer operations are executed in the ALU of the processor, e.g. addition,
subtraction, division, and multiplication. The operands are brought into the ALU from memory and
stored in high-speed storage elements called registers. Then, according to the instructions, the
operations are performed in the required sequence. The control unit and the ALU are many times
faster than other devices connected to a computer system. This enables a single processor to
control a number of external devices such as keyboards, displays, magnetic and optical disks,
sensors, and other mechanical controllers.
Output unit:-
These are the counterparts of the input units. Their basic function is to send the processed
results to the outside world. Examples: printer, speakers, monitor, etc.
Control unit:-
It effectively is the nerve center that sends signals to other units and senses their states.
The actual timing signals that govern the transfer of data between input unit, processor, memory
and output unit are generated by the control unit.
3. Write the basic performance equation. Explain the role of each of the parameters in the
equation on the performance of the computer. (8M) July 2014
We now focus our attention on the processor time component of the total elapsed time.
Let T be the processor time required to execute a program that has been prepared in some high-
level language. The compiler generates a machine-language object program that corresponds to
the source program. Assume that complete execution of the program requires the execution of N
machine-language instructions. The number N is the actual number of instruction executions,
and is not necessarily equal to the number of machine instructions in the object program: some
instructions may be executed more than once, which is the case for instructions inside a
program loop, and others may not be executed at all, depending on the input data used.
Suppose that the average number of basic steps needed to execute one machine
instruction is S, where each basic step is completed in one clock cycle. If the clock rate is R
cycles per second, the program execution time is given by
T = (N x S) / R
This is often referred to as the basic performance equation.
We must emphasize that N, S, and R are not independent parameters; changing one may
affect another. Introducing a new feature in the design of a processor will lead to improved
performance only if the overall result is to reduce the value of T.
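As a sketch, the equation can be evaluated directly; the numbers below are illustrative, not from the text:

```python
def execution_time(n_instructions: int, steps_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Basic performance equation: T = (N x S) / R seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz

# Illustrative values: 1 billion instructions, 4 cycles each, 2 GHz clock.
t = execution_time(1_000_000_000, 4, 2_000_000_000)   # 2.0 seconds
```

Note how halving S (a better processor design) or doubling R (a faster clock) each halves T, but only if the change does not increase N.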
4. With a neat diagram ,discuss the basic operational concepts of a computer? Dec 2015
Basic Operational Concepts
Instructions take a vital role for the proper working of the computer.
An appropriate program consisting of a list of instructions is stored in the memory so that
the tasks can be started.
The memory brings the Individual instructions into the processor, which executes the
specified operations.
Data to be used as operands are also stored in the memory.
Example:
Add LOCA, R0
This instruction adds the operand at memory location LOCA to the operand
present in register R0.
The above mentioned example can be written as follows:
5. List the different systems used to represent a signed number. July 2015
1) Sign and magnitude
2) 1's complement
3) 2's complement
Ans. Addressing modes: In general, a program operates on data that reside in the computer's
memory. These data can be organized in a variety of ways. If we want to keep track of students'
names, we can write them in a list. Programmers use organizations called data structures to
represent the data used in computations. These include lists, linked lists, arrays, queues, and so on.
Programs are normally written in a high-level language, which enables the programmer to
use constants, local and global variables, pointers, and arrays. The different ways in which the
location of an operand is specified in an instruction are referred to as addressing modes.
Register                      Ri        EA = Ri
Indirect                      (LOC)     EA = [LOC]
Index (register and offset)             EA = [Ri] + offset
(EA = effective address)
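These addressing functions can be sketched as a toy interpreter in Python; the register and memory contents below are hypothetical, chosen only to illustrate the EA calculations:

```python
# Hypothetical machine state for illustration only.
registers = {"R1": 0x2000}
memory = {0x1000: 0x2000, 0x2000: 42}

def effective_address(mode, operand):
    """Compute the effective address (EA) for a few addressing modes."""
    if mode == "absolute":          # EA = LOC: address given in the instruction
        return operand
    if mode == "indirect":          # EA = [LOC]: memory holds the address
        return memory[operand]
    if mode == "index":             # EA = [Ri] + offset
        reg, offset = operand
        return registers[reg] + offset
    raise ValueError(f"unknown mode: {mode}")

assert effective_address("absolute", 0x1000) == 0x1000
assert effective_address("indirect", 0x1000) == 0x2000   # memory[0x1000]
assert effective_address("index", ("R1", 8)) == 0x2008   # [R1] + 8
```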
8. Explain the shift and rotate operations with examples. (8M) June 2015
There are many applications that require the bits of an operand to be shifted right or left
some specified number of bit positions. The details of how the shifts are performed depend on
whether the operand is a signed number or some more general binary-coded information. For
general operands, we use a logical shift. For a number, we use an arithmetic shift, which preserves
the sign of the number.
Logical shifts:-
Two logical shift instructions are needed, one for shifting left (LShiftL) and another for
shifting right (LShiftR). These instructions shift an operand over a number of bit positions
specified in a count operand contained in the instruction. The general form of a logical left shift
instruction is
LShiftL count, dst
[Figure: before/after contents of register R0 and the carry flag C for the logical shifts
LShiftL #2, R0 and LShiftR #2, R0, and for the arithmetic shift AShiftR #2, R0 - the logical
shifts fill the vacated positions with zeros, while the arithmetic right shift replicates the
sign bit; in each case the last bit shifted out ends up in C]
Rotate Operations:-
In the shift operations, the bits shifted out of the operand are lost, except for the last bit
shifted out which is retained in the Carry flag C. To preserve all bits, a set of rotate instructions
can be used. They move the bits that are shifted out of one end of the operand back into the other
end. Two versions of both the left and right rotate instructions are usually provided. In one version,
the bits of the operand are simply rotated. In the other version, the rotation includes the C flag.
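A minimal Python sketch of these operations on 32-bit operands follows; it is an abstraction of the behaviour described above, not the instruction set of any particular machine. Each function returns the new operand value together with the carry flag C:

```python
MASK32 = 0xFFFFFFFF

def lshiftl(x, n):
    """Logical shift left: zeros enter from the right,
    the last bit shifted out lands in the carry flag."""
    carry = (x >> (32 - n)) & 1
    return (x << n) & MASK32, carry

def lshiftr(x, n):
    """Logical shift right: zeros enter from the left."""
    carry = (x >> (n - 1)) & 1
    return x >> n, carry

def ashiftr(x, n):
    """Arithmetic shift right: the sign bit is replicated,
    preserving the sign of the number."""
    sign = x & 0x80000000
    out = x
    for _ in range(n):
        carry = out & 1
        out = (out >> 1) | sign
    return out, carry

def rotate_l(x, n):
    """Rotate left without carry: bits shifted out of the left
    end re-enter at the right, so no bits are lost."""
    return ((x << n) | (x >> (32 - n))) & MASK32

assert lshiftl(0x80000001, 1) == (0x00000002, 1)   # top bit -> C
assert ashiftr(0x80000000, 2) == (0xE0000000, 0)   # sign replicated
assert rotate_l(0x80000000, 1) == 0x00000001       # bit wraps around
```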
9. What is the need for an addressing mode? Explain the following addressing modes with
examples: immediate, direct, indirect, index, relative (8M) Dec 2014 /July 2015
Immediate mode - The operand is given explicitly in the instruction. For example, the instruction
Move 200immediate, R0
places the value 200 in register R0. Clearly, the Immediate mode is only used to specify the
value of a source operand. Using a subscript to denote the Immediate mode is not appropriate in
assembly languages. A common convention is to use the sharp sign (#) in front of the value to
indicate that this value is to be used as an immediate operand. Hence, we write the instruction
above in the form
Move #200, R0
Register mode - The operand is the contents of a processor register; the name (address) of the
register is given in the instruction.
Absolute mode - The operand is in a memory location; the address of this location is given
explicitly in the instruction. (In some assembly languages, this mode is called Direct.)
The instruction
Move LOC, R2
uses the Absolute mode for the source operand and the Register mode for the destination.
Processor registers are used as temporary storage locations; the data in a register are
accessed using the Register mode. The Absolute mode can represent global variables in a program.
A declaration such as
Integer A, B;
10. a) What is subroutine linkage? How are parameters passed to subroutines? (8M) June 2016
After a subroutine has been executed, the calling program must resume execution,
continuing immediately after the instruction that called the subroutine. The subroutine is said to
return to the program that called it by executing a Return instruction.
The way in which a computer makes it possible to call and return from subroutines is
referred to as its subroutine linkage method. The simplest subroutine linkage method is to save the
return address in a specific location, which may be a register dedicated to this function. Such a
register is called the link register. When the subroutine completes its task, the Return instruction
returns to the calling program by branching indirectly through the link register.
The Call instruction is just a special branch instruction that performs the following
operations: it stores the contents of the PC in the link register, and then branches to the
target address specified by the instruction. The Return instruction branches to the address
contained in the link register.
[Figure: memory and processor contents during a Call to a subroutine at address 1000 - the
return address 204 is saved in the link register by Call and copied back into the PC by Return]
Now, observe how space is used in the stack in the example. During execution of the
subroutine, six locations at the top of the stack contain entries that are needed by the subroutine.
These locations constitute a private workspace for the subroutine, created at the time the subroutine
is entered and freed up when the subroutine returns control to the calling program. Such space is
called a stack frame.
[Figure b: layout of a stack frame - the saved registers and local variables (localvar1, …) sit
at the top of the stack, below them the saved old FP (pointed to by the frame pointer FP), then
the return address, the four parameters param1-param4, and finally the old TOS entry]
Fig. b shows an example of a commonly used layout for information in a stack frame. In
addition to the stack pointer SP, it is useful to have another pointer register, called the frame pointer
(FP), for convenient access to the parameters passed to the subroutine and to the local memory
variables used by the subroutine. These local variables are only used within the subroutine, so it is
appropriate to allocate space for them in the stack frame associated with the subroutine. We assume
that four parameters are passed to the subroutine, three local variables are used within the
subroutine, and registers R0 and R1 need to be saved because they will also be used within the
subroutine.
The pointers SP and FP are manipulated as the stack frame is built, used, and dismantled
for a particular invocation of the subroutine. We begin by assuming that SP points to the old
top-of-stack (TOS) element in fig b. Before the subroutine is called, the calling program pushes
the four parameters onto the stack. The Call instruction is then executed, resulting in the
return address being pushed onto the stack. Now, SP points to this return address, and the first
instruction of the subroutine is about to be executed. This is the point at which the frame
pointer FP is set to contain the proper memory address. Since FP is usually a general-purpose
register, it may contain information of use to the calling program. Therefore, its contents are
saved by pushing them onto the stack. Since the SP now points to this position, its contents are
copied into FP.
Push FP
Move SP, FP
After these instructions are executed, both SP and FP point to the saved FP contents.
Space for the three local variables (12 bytes) is then allocated on the stack by executing
Subtract #12, SP
Finally, the contents of processor registers R0 and R1 are saved by pushing them onto the
stack. At this point, the stack frame has been set up.
The subroutine now executes its task. When the task is completed, the subroutine pops the
saved values of R1 and R0 back into those registers, removes the local variables from the stack
frame by executing the instruction.
Add #12, SP
and pops the saved old value of FP back into FP. At this point, SP points to the return
address, so the Return instruction can be executed, transferring control back to the calling program.
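The frame build-up described above can be sketched as a toy byte-addressed stack in Python. The addresses and labels below are illustrative assumptions (4-byte entries, stack growing toward lower addresses), not the textbook's exact figure:

```python
# Toy stack: a dict from byte address to the value stored there.
stack = {}
SP = 1000                      # old top-of-stack (TOS)

def push(value):
    """The stack grows toward lower addresses, one 4-byte word per entry."""
    global SP
    SP -= 4
    stack[SP] = value

# The calling program pushes the four parameters.
for p in ("param4", "param3", "param2", "param1"):
    push(p)
push("return address")         # pushed by the Call instruction
push("saved FP")               # Push FP
FP = SP                        # Move SP, FP : frame pointer now set
SP -= 12                       # Subtract #12, SP : room for 3 local variables
push("saved R0")               # save registers used by the subroutine
push("saved R1")

# FP gives convenient access to both parameters and locals:
# param1 sits at FP + 8, and localvar1 would sit at FP - 4.
assert stack[FP] == "saved FP"
assert stack[FP + 8] == "param1"
```

Tearing the frame down simply reverses these steps: pop R1 and R0, add 12 back to SP, pop the old FP, and Return through the saved return address.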
Computer instructions are the basic components of a machine language program. They are also
known as macrooperations, since each one is comprised of a sequence of microoperations.
Each instruction initiates a sequence of microoperations that fetch operands from registers or
memory, possibly perform arithmetic, logic, or shift operations, and store results in registers or
memory.
Instructions are encoded as binary instruction codes. Each instruction code consists of
an operation code, or opcode, which designates the overall purpose of the instruction (e.g. add,
subtract, move, input, etc.). The number of bits allocated for the opcode determines how many
different instructions the architecture supports.
In addition to the opcode, many instructions also contain one or more operands, which indicate
where in registers or memory the data required for the operation is located. For example,
an add instruction requires two operands, and a not instruction requires one.
  15  12   11    6   5     0
+--------+---------+---------+
| Opcode | Operand | Operand |
+--------+---------+---------+
The opcode and operands are most often encoded as unsigned binary numbers in order to
minimize the number of bits used to store them. For example, a 4-bit opcode encoded as a binary
number could represent up to 16 different operations.
The control unit is responsible for decoding the opcode and operand bits in the instruction
register, and then generating the control signals necessary to drive all other hardware in the CPU
to perform the sequence of microoperations that comprise the instruction.
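The decoding step can be sketched in Python for the 16-bit format shown above (a 4-bit opcode in bits 15-12 and two 6-bit operand fields); the opcode values used in the example are hypothetical:

```python
def decode(word: int):
    """Split a 16-bit instruction word into its opcode (bits 15-12)
    and two 6-bit operand fields (bits 11-6 and 5-0), using the same
    shift-and-mask operations a control unit's decoder performs in hardware."""
    opcode = (word >> 12) & 0xF
    operand1 = (word >> 6) & 0x3F
    operand2 = word & 0x3F
    return opcode, operand1, operand2

# Encode a hypothetical instruction: opcode 0b0001, operands 3 and 5.
word = (0b0001 << 12) | (3 << 6) | 5
assert decode(word) == (1, 3, 5)
```

A 4-bit opcode field supports at most 2^4 = 16 distinct operations, matching the claim above.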
12. Big-Endian and Little-Endian assignment,explain with necessary figure. Jan 2014
Big Endian
In big endian, you store the most significant byte in the smallest address. Here's how the
4-byte value 0x90AB12CD would look:
Address Value
1000 90
1001 AB
1002 12
1003 CD
Little Endian
In little endian, you store the least significant byte in the smallest address. Here's how the
same value would look:
Address Value
1000 CD
1001 12
1002 AB
1003 90
Notice that this is in the reverse order compared to big endian. To remember which is which,
recall whether the least significant byte is stored first (thus, little endian) or the most significant
byte is stored first (thus, big endian).
Notice I used "byte" instead of "bit" in "least significant byte". I sometimes abbreviate this
as LSB and MSB, with the 'B' capitalized to refer to byte, and use the lowercase 'b' to represent
bit. I only refer to the most and least significant byte when it comes to endianness.
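Python's struct module can demonstrate both orderings for the 4-byte value 0x90AB12CD used in the tables above:

```python
import struct

value = 0x90AB12CD
big    = struct.pack(">I", value)   # ">" = big endian: MSB first
little = struct.pack("<I", value)   # "<" = little endian: LSB first

# Byte at the lowest address comes first in each byte string.
assert big.hex()    == "90ab12cd"   # addresses 1000..1003 hold 90 AB 12 CD
assert little.hex() == "cd12ab90"   # addresses 1000..1003 hold CD 12 AB 90
```

Unpacking with the matching format character recovers the original value either way.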
13. Explain logical and arithmetic shift instructions with an example. July 2016
Ans: The restriction that an instruction must occupy only one word has led to a style of
computers that have become known as reduced instruction set computers (RISC). The
RISC approach introduced other restrictions, such as requiring that all manipulation of data be
done on operands that are already in processor registers. This restriction means that the
above addition would need a two-instruction sequence:
Move (R3), R1
Add R1, R2
If the Add instruction only has to specify the two registers, it will
need just a portion of a 32-bit word. So, we may provide a more powerful
instruction that uses three operands:
Add R1, R2, R3
which performs the operation R3 <- [R1] + [R2].
1. Explain the following: i) interrupt ii) vectored interrupt iii) interrupt nesting
iv) an exception, and give two examples. (6M) Dec 2015
Ans. Interrupt
We pointed out that an I/O device requests an interrupt by activating a bus line called
interrupt-request. Most computers are likely to have several I/O devices that can request an
interrupt. A single interrupt-request line may be used to serve n devices as depicted. All devices
are connected to the line via switches to ground. To request an interrupt, a device closes its
associated switch. Thus, if all interrupt-request signals INTR1 to INTRn are inactive, that is, if all
switches are open, the voltage on the interrupt-request line will be equal to Vdd. This is the inactive
state of the line. Since the closing of one or more switches will cause the line voltage to drop to 0,
the value of INTR is the logical OR of the requests from individual devices, that is,
It is customary to use the complemented form, INTR , to name the interrupt-request signal on the
common line, because this signal is active when in the low-voltage state.
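This wired-OR behaviour can be sketched in one line of Python; it abstracts the electrical circuit (any closed switch pulls the common line low) into a boolean function:

```python
def intr_line(requests):
    """The common interrupt-request line carries the logical OR of the
    individual device requests: it is active if any device's switch is
    closed, and inactive only when every switch is open."""
    return any(requests)

assert intr_line([False, False, True]) is True    # one device requesting
assert intr_line([False, False, False]) is False  # line stays at Vdd
```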
Vectored Interrupts:-
To reduce the time involved in the polling process, a device requesting an interrupt may
identify itself directly to the processor. Then, the processor can immediately start executing the
corresponding interrupt-service routine. The term vectored interrupts refers to all interrupt-
handling schemes based on this approach.
A device requesting an interrupt can identify itself by sending a special code to the
processor over the bus. This enables the processor to identify individual devices even if they share
a single interrupt-request line. The code supplied by the device may represent the starting address
of the interrupt-service routine for that device. The code length is typically in the range of 4 to 8
bits. The remainder of the address is supplied by the processor based on the area in its memory
where the addresses for interrupt-service routines are located.
This arrangement implies that the interrupt-service routine for a given device must always
start at the same location. The programmer can gain some flexibility by storing in this location an
instruction that causes a branch to the appropriate routine.
Interrupt Nesting: -
The simplest scheme is to disable interrupts during the execution of an interrupt-service
routine. This arrangement is often used when several devices are involved, in which case
execution of a given interrupt-service routine, once started, always continues to completion
before the processor accepts an interrupt request from a second device. Interrupt-service
routines are typically short, and the delay they may cause is acceptable for most simple devices.
For some devices, however, a long delay in responding to an interrupt request may lead to
erroneous operation. Consider, for example, a computer that keeps track of the time of day using
a real-time clock. This is a device that sends interrupt requests to the processor at regular intervals.
For each of these requests, the processor executes a short interrupt-service routine to increment a
set of counters in the memory that keep track of time in seconds, minutes, and so on. Proper
operation requires that the delay in responding to an interrupt request from the real-time clock be
small in comparison with the interval between two successive requests. To ensure that this
requirement is satisfied in the presence of other interrupting devices, it may be necessary to accept
an interrupt request from the clock during the execution of an interrupt-service routine for another
device.
This example suggests that I/O devices should be organized in a priority structure. An
interrupt request from a high-priority device should be accepted while the processor is servicing
another request from a lower-priority device.
The processor's priority is usually encoded in a few bits of the processor status word. It
can be changed by program instructions that write into the PS. These are privileged instructions,
which can be executed only while the processor is running in the supervisor mode. The processor
is in the supervisor mode only when executing operating-system routines. It switches to the user
mode before beginning to execute application programs. Thus, a user program cannot accidentally,
or intentionally, change the priority of the processor and disrupt the system's operation. An
attempt to execute a privileged instruction while in the user mode leads to a special type of
interrupt called a privilege exception.
2. Explain, with the help of a diagram, the working of a daisy chain with multiple priority
levels and multiple devices in each level. (8M) Dec 2014, Jan 2015
Simultaneous Requests:
Let us now consider the problem of simultaneous arrivals of interrupt requests from two
or more devices. The processor must have some means of deciding which requests to service first.
Using a priority scheme such as that of figure, the solution is straightforward. The processor simply
accepts the requests having the highest priority.
Polling the status registers of the I/O devices is the simplest such mechanism. In this case,
priority is determined by the order in which the devices are polled. When vectored interrupts are
used, we must ensure that only one device is selected to send its interrupt vector code. A widely
used scheme is to connect the devices to form a daisy chain, as shown in figure 3a. The interrupt-
request line INTR is common to all devices. The interrupt-acknowledge line, INTA, is connected
in a daisy-chain fashion, such that the INTA signal propagates serially through the devices.
[Figure 3a: daisy-chain arrangement - the common INTR line connects all devices to the
processor, while the interrupt-acknowledge signal INTA propagates serially from device to device]
When several devices raise an interrupt request and the INTR line is activated, the
processor responds by setting the INTA line to 1. This signal is received by device 1. Device 1
passes the signal on to device 2 only if it does not require any service. If device 1 has a pending
request for interrupt, it blocks the INTA signal and proceeds to put its identifying code on the data
lines. Therefore, in the daisy-chain arrangement, the device that is electrically closest to the
processor has the highest priority. The second device along the chain has second highest priority,
and so on.
The scheme in figure 3.a requires considerably fewer wires than the individual connections
in figure 2. The main advantage of the scheme in figure 2 is that it allows the processor to accept
interrupt requests from some devices but not from others, depending upon their priorities. The two
schemes may be combined to produce the more general structure in figure 3b. Devices are
organized in groups, and each group is connected at a different priority level. Within a group,
devices are connected in a daisy chain. This organization is used in many computer systems.
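The serial propagation of INTA can be sketched in Python. Devices are listed in order of electrical closeness to the processor; this is an abstraction of the grant logic, not a hardware model:

```python
def daisy_chain_grant(requests):
    """INTA propagates serially along the chain; the first device with a
    pending request blocks the signal and takes the grant, so position in
    the chain determines priority."""
    for position, requesting in enumerate(requests):
        if requesting:
            return position     # this device puts its code on the data lines
    return None                 # no device is requesting service

# Devices 2 and 3 both request; device 2, being closer, wins.
assert daisy_chain_grant([False, True, True]) == 1
assert daisy_chain_grant([False, False, False]) is None
```

The multi-level scheme of figure 3b would simply run this chain once per priority group, highest-priority group first.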
3. Discuss the different schemes available to disable and enable the interrupts (8M)
The facilities provided in a computer must give the programmer complete control over the
events that take place during program execution. The arrival of an interrupt request from an
external device causes the processor to suspend the execution of one program and start the
execution of another. Because interrupts can arrive at any time, they may alter the sequence of
events from that envisaged by the programmer. Hence, the interruption of program execution must
be carefully controlled.
Let us consider in detail the specific case of a single interrupt request from one device.
When a device activates the interrupt-request signal, it keeps this signal activated until it learns
that the processor has accepted its request. This means that the interrupt-request signal will be
active during execution of the interrupt-service routine, perhaps until an instruction is reached that
accesses the device in question.
The first possibility is to have the processor hardware ignore the interrupt-request line until
the execution of the first instruction of the interrupt-service routine has been completed. Then, by
using an Interrupt-disable instruction as the first instruction in the interrupt-service routine, the
programmer can ensure that no further interruptions will occur until an Interrupt-enable instruction
is executed. Typically, the Interrupt-enable instruction will be the last instruction in the interrupt-
service routine before the Return-from-interrupt instruction. The processor must guarantee that
execution of the Return-from-interrupt instruction is completed before further interruption can
occur.
The second option, which is suitable for a simple processor with only one interrupt-request
line, is to have the processor automatically disable interrupts before starting the execution of the
interrupt-service routine. After saving the contents of the PC and the processor status register (PS)
on the stack, the processor performs the equivalent of executing an Interrupt-disable instruction. It
is often the case that one bit in the PS register, called Interrupt-enable, indicates whether interrupts
are enabled.
In the third option, the processor has a special interrupt-request line for which the
interrupt-handling circuit responds only to the leading edge of the signal. Such a line is said to be
edge-triggered.Before proceeding to study more complex aspects of interrupts, let us summarize
the sequence of events involved in handling an interrupt request from a single device. Assuming
that interrupts are enabled, the following is a typical scenario.
1. The device raises an interrupt request.
2. The processor interrupts the program currently being executed.
3. Interrupts are disabled by changing the control bits in the PS (except in the case of edge-
triggered interrupts).
4. The device is informed that its request has been recognized, and in response, it deactivates
the interrupt-request signal.
5. The action requested by the interrupt is performed by the interrupt-service routine.
6. Interrupts are enabled and execution of the interrupted program is resumed.
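The scenario above can be sketched in Python. The state layout and the position of the Interrupt-enable bit in the PS are illustrative assumptions, not any real machine's encoding:

```python
def handle_interrupt(state, service_routine):
    """Sketch of the single-device scenario: save the PC and PS, disable
    interrupts, run the service routine, then restore context and resume."""
    state["stack"].append((state["PC"], state["PS"]))   # save PC and PS
    state["PS"] &= ~0x1            # clear the (assumed) Interrupt-enable bit
    service_routine(state)         # the action requested by the interrupt
    # Return-from-interrupt: restore PC and PS, which re-enables interrupts.
    state["PC"], state["PS"] = state["stack"].pop()

state = {"PC": 100, "PS": 0x1, "stack": []}
seen_ps = []
handle_interrupt(state, lambda s: seen_ps.append(s["PS"]))
assert seen_ps == [0]              # interrupts were disabled inside the routine
```

After the call, the interrupted program's PC and PS are exactly as they were, so execution resumes transparently.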
4. Discuss the different schemes available to disable and enable the interrupts. (8M) Dec 2015
Refer to the answer to Question 3 above.
Centralized Arbitration:- The bus arbiter may be the processor or a separate unit connected to
the bus. A basic arrangement in which the processor contains the bus arbitration circuitry. In this
case, the processor is normally the bus master unless it grants bus mastership to one of the DMA
controllers. A DMA controller indicates that it needs to become the bus master by activating the
Bus-Request line. The signal on the Bus-Request line is the logical OR of the bus requests from all
the devices connected to it. When Bus-Request is activated, the processor activates the Bus-Grant
signal, BG1, indicating to the DMA controllers that they may use the bus when it becomes free.
This signal is connected to all DMA controllers using a daisy-chain arrangement. Thus, if DMA
controller 1 is requesting the bus, it blocks the propagation of the grant signal to other devices.
Otherwise, it passes the grant downstream by asserting BG2. The current bus master indicates to
all device that it is using the bus by activating another open-controller line called Bus-Busy,
BBSY . Hence, after receiving the Bus-Grant signal, a DMA controller waits for Bus-Busy to
become inactive, then assumes mastership of the bus. At this time, it activates Bus-Busy to prevent
other devices from using the bus at the same time.
6.Draw the arrangement of a single bus structure and brief about memory mapped
I/O. Jan 2014
Ans: In the original stored-program design, a central control unit and arithmetic logic unit (ALU,
which von Neumann called the central arithmetic part) were combined with computer memory and
input and output functions to form a stored-program computer. The report presented a general
organization and theoretical model of the computer, however, not the implementation of that
model. Soon designs integrated the control unit and ALU into what became known as the central
processing unit (CPU).
Computers in the 1950s and 1960s were generally constructed in an ad hoc fashion. For
example, the CPU, memory, and input/output units were each one or more cabinets connected by
cables. Engineers used the common techniques of standardized bundles of wires and extended
the concept as backplanes were used to hold printed circuit boards in these early machines.
7. Explain i)interrupt enabling ii)edge triggering with respect interrupts. Jan 2015
Ans: IF (Interrupt Flag) is a system flag bit in the x86 architecture's FLAGS register, which
determines whether or not the CPU will handle maskable hardware interrupts.
The bit, which is bit 9 of the FLAGS register, may be set or cleared by programs with sufficient
privileges, as usually determined by the Operating System. If the flag is set to 1, maskable
hardware interrupts will be handled. If cleared (set to 0), such interrupts will be ignored. IF does
not affect the handling of non-maskable interrupts or software interrupts generated by
the INT instruction.
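The masking behavior described above can be modeled in a few lines. This is an illustrative sketch in Python, not real x86 code; only the fact that IF is bit 9 of FLAGS comes from the text, the function names are assumptions.

```python
# IF (Interrupt Flag) is bit 9 of the x86 FLAGS register.
IF_BIT = 1 << 9

def handles_maskable_interrupt(flags: int) -> bool:
    """A maskable hardware interrupt is serviced only when IF = 1."""
    return bool(flags & IF_BIT)

def cli(flags: int) -> int:
    """Clear IF (like the x86 CLI instruction): mask interrupts."""
    return flags & ~IF_BIT

def sti(flags: int) -> int:
    """Set IF (like the x86 STI instruction): enable interrupts."""
    return flags | IF_BIT

flags = sti(0)
print(handles_maskable_interrupt(flags))   # True: interrupts enabled
flags = cli(flags)
print(handles_maskable_interrupt(flags))   # False: interrupts masked
```

Note that a non-maskable interrupt would bypass this check entirely, as the text states.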
8. Explain with a block diagram a general 8-bit parallel interface. (8M) July 2015
Ans. [Figure: general 8-bit parallel interface. A My-address decoder and register-select lines
RS2, RS1, RS0 feed a register-select and control block; status and control lines C1 and C2,
Ready, and R/W connect the processor side to the interface.]
The circuit in figure 16 has separate input and output data lines for connection to an I/O device. A
more flexible parallel port is created if the data lines to I/O devices are bidirectional. Figure 17
shows a general-purpose parallel interface circuit that can be configured in a variety of ways. Data
lines P7 through P0 can be used for either input or output purposes. For increased flexibility, the
circuit makes it possible for some lines to serve as inputs and some lines to serve as outputs, under
program control. The DATAOUT register is connected to these lines via three-state drivers that
are controlled by a data direction register, DDR. The processor can write any 8-bit pattern into
DDR. For a given bit, if the DDR value is 1, the corresponding data line acts as an output line;
otherwise, it acts as an input line.
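The DDR mechanism just described can be sketched as a bitwise operation. The register names DDR and DATAOUT follow the text; the function itself is an illustrative model, not an actual interface implementation.

```python
# Model of a DDR-controlled bidirectional port (lines P7..P0).
def port_lines(ddr: int, dataout: int, external: int) -> int:
    """Value observed on the port lines.

    For each bit: if the DDR bit is 1, the three-state driver is
    enabled and the line is driven by DATAOUT (output); if the DDR
    bit is 0, the driver is off and the external device drives the
    line (input).
    """
    return (dataout & ddr) | (external & ~ddr & 0xFF)

# Lower nibble configured as outputs, upper nibble as inputs.
print(hex(port_lines(ddr=0x0F, dataout=0x05, external=0xA0)))  # 0xa5
```

Writing a new pattern into DDR under program control thus reconfigures which lines act as inputs and which as outputs, which is the flexibility the text describes.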
9.With the help of a data transfer signals explain how a real operation is performed using
PCI bus. (8M) Dec 2015
The PCI bus is a good example of a system bus that grew out of the need for
standardization. It supports the functions found on a processor bus, but in a standardized format
that is independent of any particular processor. Devices connected to the PCI bus appear to the
processor as if they were connected directly to the processor bus. They are assigned addresses in
the memory address space of the processor.
The PCI follows a sequence of bus standards that were used primarily in IBM PCs. Early
PCs used the 8-bit XT bus, whose signals closely mimicked those of Intel's 80x86 processors.
Later, the 16-bit bus used on the PC AT computers became known as the ISA bus. Its extended 32-
bit version is known as the EISA bus. Other buses developed in the eighties with similar
capabilities are the Microchannel used in IBM PCs and the NuBus used in Macintosh computers.
The PCI was developed as a low-cost bus that is truly processor independent. Its design
anticipated a rapidly growing demand for bus bandwidth to support high-speed disks and graphic
and video devices, as well as the specialized needs of multiprocessor systems. As a result, the PCI
is still popular as an industry standard almost a decade after it was first introduced in 1992.
An important feature that the PCI pioneered is a plug-and-play capability for connecting I/O
devices. To connect a new device, the user simply connects the device interface board to the bus.
The software takes care of the rest.
Data Transfer:-
In today's computers, most memory transfers involve a burst of data rather than just one
word. The reason is that modern processors include a cache memory. Data are transferred between
the cache and the main memory in bursts of several words each. The words involved in such a
transfer are stored at successive memory locations. When the processor (actually the cache
controller) specifies an address and requests a read operation from the main memory, the memory
responds by sending a sequence of data words starting at that address. Similarly, during a write
operation, the processor sends a memory address followed by a sequence of data words, to be
written in successive memory locations starting at the address. The PCI is designed primarily to
support this mode of operation. A read or write operation involving a single word is simply treated
as a burst of length one.
The bus supports three independent address spaces: memory, I/O, and configuration. The
first two are self-explanatory. The I/O address space is intended for use with processors, such as
Pentium, that have a separate I/O address space. However, as noted earlier, the system designer may
choose to use memory-mapped I/O even when a separate I/O address space is available. In fact,
this is the approach recommended by the PCI standard to support its plug-and-play capability. A 4-bit command that
accompanies the address identifies which of the three spaces is being used in a given data transfer
operation.
The signaling convention on the PCI bus is similar to the one described earlier, where we
assumed that the master maintains the address information on the bus until the data transfer is
completed. But this is not necessary. The address is needed only long enough for the slave to be
selected. The slave can store the address in its internal buffer. Thus, the address is needed on the
bus for one clock cycle only, freeing the address lines to be used for sending data in subsequent
clock cycles. The result is a significant cost reduction because the number of wires on a bus is an
important cost factor. This approach is used in the PCI bus.
At any given time, one device is the bus master. It has the right to initiate data transfers by
issuing read and write commands. A master is called an initiator in PCI terminology. This is either
a processor or a DMA controller. The addressed device that responds to read and write commands
is called a target.
Device Configuration:-
When an I/O device is connected to a computer, several actions are needed to configure
both the device and the software that communicates with it.
The PCI simplifies this process by incorporating in each I/O device interface a small
configuration ROM memory that stores information about that device. The configuration ROMs
of all devices are accessible in the configuration address space. The PCI initialization software reads
these ROMs whenever the system is powered up or reset. In each case, it determines whether the
device is a printer, a keyboard, an Ethernet interface, or a disk controller. It can further learn about
various device options and characteristics.
Devices are assigned addresses during the initialization process. This means that during
the bus configuration operation, devices cannot be accessed based on their address, as they have
not yet been assigned one. Hence, the configuration address space uses a different mechanism.
Each device has an input signal called Initialization Device Select, IDSEL#.
The PCI bus has gained great popularity in the PC world. It is also used in many other
computers, such as SUNs, to benefit from the wide range of I/O devices for which a PCI interface
is available. In the case of some processors, such as the Compaq Alpha, the PCI-processor bridge
circuit is built on the processor chip itself, further simplifying system design and packaging.
10. Explain briefly bus arbitration phase in SCSI bus.(8M) June 2015,July 2016
Ans. SCSI Bus:- The processor sends a command to the SCSI controller, which causes the
following sequence of events to take place:
1. The SCSI controller, acting as an initiator, contends for control of the bus.
2. When the initiator wins the arbitration process, it selects the target controller and hands
over control of the bus to it.
3. The target starts an output operation (from initiator to target); in response to this, the
initiator sends a command specifying the required read operation.
4. The target, realizing that it first needs to perform a disk seek operation, sends a message to
the initiator indicating that it will temporarily suspend the connection between them. Then
it releases the bus.
5. The target controller sends a command to the disk drive to move the read head to the first
sector involved in the requested read operation. Then, it reads the data stored in that sector
and stores them in a data buffer. When it is ready to begin transferring data to the initiator,
the target requests control of the bus. After it wins arbitration, it reselects the initiator
controller, thus restoring the suspended connection.
6. The target transfers the contents of the data buffer to the initiator and then suspends the
connection again. Data are transferred either 8 or 16 bits in parallel, depending on the width
of the bus.
7. The target controller sends a command to the disk drive to perform another seek operation.
Then, it transfers the contents of the second disk sector to the initiator as before. At the end
of this transfer, the logical connection between the two controllers is terminated.
8. As the initiator controller receives the data, it stores them into the main memory using the
DMA approach.
9. The SCSI controller sends an interrupt to the processor to inform it that the requested
operation has been completed.
This scenario shows that the messages exchanged over the SCSI bus are at a higher
level than those exchanged over the processor bus. In this context, a higher level means that the
messages refer to operations that may require several steps to complete, depending on the device.
Neither the processor nor the SCSI controller need be aware of the details of operation of the
particular device involved in a data transfer. In the preceding example, the processor need not be
involved in the disk seek operation.
11.In a computer system why a PCI bus is used? With a neat sketch, explain how the read
operation is performed along with the role of IRDY#/TRDY# on the PCI bus (8M) Dec 2014
,July 2016.
The PCI bus is a good example of a system bus that grew out of the need for
standardization. It supports the functions found on a processor bus, but in a standardized format
that is independent of any particular processor. Devices connected to the PCI bus appear to the
processor as if they were connected directly to the processor bus. An important feature that the
PCI pioneered is a plug-and-play capability for connecting I/O devices. To connect a new device,
the user simply connects the device interface board to the bus. The software takes care of the rest.
Data Transfer:-
In today's computers, most memory transfers involve a burst of data rather than just one
word. The reason is that modern processors include a cache memory. Data are transferred between
the cache and the main memory in bursts of several words each. The words involved in such a
transfer are stored at successive memory locations. When the processor (actually the cache
controller) specifies an address and requests a read operation from the main memory, the memory
responds by sending a sequence of data words starting at that address. Similarly, during a write
operation, the processor sends a memory address followed by a sequence of data words, to be
written in successive memory locations starting at the address. The PCI is designed primarily to
support this mode of operation. A read or write operation involving a single word is simply treated
as a burst of length one.
The bus supports three independent address spaces: memory, I/O, and configuration. The
first two are self-explanatory. The I/O address space is intended for use with processors, such as
Pentium, that have a separate I/O address space. However, as noted earlier, the system designer may
choose to use memory-mapped I/O even when a separate I/O address space is available. In fact,
this is the approach recommended by the PCI standard to support its plug-and-play capability. A 4-bit command that
accompanies the address identifies which of the three spaces is being used in a given data transfer
operation.
12. Draw the block diagram of universal serial bus (USB) structure connected to the host computer.
Briefly explain all fields of packets that are used for communication between a host and a
device connected to a USB port. (8M) June 2014/July 2015
The USB supports two speeds of operation, called low-speed (1.5 megabits/s) and full-
speed (12 megabits/s). The most recent revision of the bus specification (USB 2.0) introduced a
third speed of operation, called high-speed (480 megabits/s). The USB is quickly gaining
acceptance in the marketplace, and with the addition of the high-speed capability it may well
become the interconnection method of choice for most computer devices.
Provides a simple, low-cost and easy to use interconnection system that overcomes the
difficulties due to the limited number of I/O ports available on a computer.
Accommodate a wide range of data transfer characteristics for I/O devices, including
telephone and Internet connections.
Enhance user convenience through a plug-and-play mode of operation
Port Limitation:-
The parallel and serial ports described in previous section provide a general-purpose point
of connection through which a variety of low-to medium-speed devices can be connected to a
computer. For practical reasons, only a few such ports are provided in a typical computer.
Device Characteristics:-
The kinds of devices that may be connected to a computer cover a wide range of
functionality. The speed, volume, and timing constraints associated with data transfers to and from
such devices vary significantly.
A variety of simple devices that may be attached to a computer generate data of a similar
nature: low-speed and asynchronous. Computer mice and the controls and manipulators used with
video games are good examples.
Plug-and-Play:- As computers become part of everyday life, their existence should become
increasingly transparent. For example, when operating a home theater system, which includes at
least one computer, the user should not find it necessary to turn the computer off or to restart the
system to connect or disconnect a device.
The plug-and-play feature means that a new device, such as an additional speaker, can be
connected at any time while the system is operating. The system should detect the existence of this
new device automatically, identify the appropriate device-driver software and any other facilities
needed to service that device, and establish the appropriate addresses and logical connections to
enable them to communicate. The plug-and-play requirement has many implications at all levels
in the system, from the hardware to the operating system and the applications software. One of the
primary objectives of the design of the USB has been to provide a plug-and-play capability.
13. Draw the hardware components needed for connecting a keyboard to a process and
explain in brief. Jan 2014
ANS: The CPU (Central Processing Unit) is the 'brain' of the computer.
It's typically a square ceramic package plugged into the motherboard, with a large heat sink on
top (and often a fan on top of that heat sink).
All instructions the computer will process are processed by the CPU. There are many "CPU
architectures", each of which has its own characteristics and trade-offs. The dominant CPU
architectures used in personal computing are x86 and PowerPC. x86 is easily the most popular
processor for this class of machine (the dominant manufacturers of x86 CPUs
are Intel and AMD). The other architectures are used, for instance, in workstations, servers or
embedded systems. CPUs contain a small amount of static RAM (SRAM) called a cache. Some
processors have two or three levels of cache, containing as much as several megabytes of
memory.
Dual Core
Some of the new processors made by Intel and AMD are Dual core. The Intel designation for
dual core are "Pentium D", "Core Duo" and "Core 2 Duo" while AMD has its "X2" series and
"FX-6x".
The core is where the data is processed and turned into commands directed at the rest of the
computer. Having two cores increases the data flow into the processor and the command flow
out of the processor, potentially doubling the processing power, but the increased performance is
only visible with multithreaded applications and heavy multitasking.
Motherboard
The Motherboard (also called Mainboard) is a large, thin, flat, rectangular fiberglass board
(typically green) attached to the case. The Motherboard carries the CPU, the RAM,
the chipset, and the expansion slots (PCI, AGP for graphics, ISA, etc.).
The Motherboard also holds things like the BIOS (Basic Input Output System) and the CMOS
Battery (a coin cell that keeps an embedded RAM in the motherboard, often NVRAM, powered
to keep various settings in effect).
RAM
Random Access Memory (RAM) is a memory that the microprocessor uses to store data during
processing. This memory is volatile (loses its contents at power-down). When a software
application is launched, the executable program is loaded from the hard drive into RAM. The
microprocessor supplies addresses to the RAM to read instructions and data from it.
14. List the SCSI bus signals with their functionalities. Jan 2014
Parallel SCSI (formally, SCSI Parallel Interface, or SPI) is one of the interface
implementations in the SCSI family. In addition to being a data bus, SPI is a parallel electrical
bus: There is one set of electrical connections stretching from one end of the SCSI bus to the
other. A SCSI device attaches to the bus but does not interrupt it. Both ends of the bus must
be terminated.
[Figure: a 1K (1024-bit) memory chip organized as a 32 x 32 cell array, with a 5-bit row
address decoder driving word lines W0 through W31, two 32-to-1 multiplexers on the data path,
and Sense/Write circuitry.]
The 10-bit address is divided into two groups of 5 bits each to form the row and column addresses
for the cell array. A row address selects a row of 32 cells, all of which are accessed in parallel.
One of these, selected by the column address, is connected to the external data lines by the input
and output multiplexers. This structure can store 1024 bits and can be implemented in a 16-pin chip.
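The row/column split described above is a simple bit-field extraction. A minimal sketch (the high-order 5 bits taken as the row address and the low-order 5 bits as the column address is an assumption about the layout, since the figure is not reproduced here):

```python
# Split a 10-bit address into 5-bit row and column addresses
# for a 32 x 32 cell array.
def split_address(addr: int) -> tuple[int, int]:
    row = (addr >> 5) & 0x1F   # selects one of 32 rows (word line)
    col = addr & 0x1F          # selects one cell via the 32-to-1 MUX
    return row, col

print(split_address(0b1011001101))  # (22, 13)
```

All 32 cells of row 22 are read in parallel; the column value 13 then picks the single bit routed to the external data line.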
2. Explain the working of a single-transistor dynamic memory cell. (10M) July 2016
The basic idea of dynamic memory is that information is stored in the form of a charge on
the capacitor. An example of a dynamic memory cell is shown below:
When the transistor T is turned on and an appropriate voltage is applied to the bit line,
information is stored in the cell, in the form of a known amount of charge stored on the capacitor.
After the transistor is turned off, the capacitor begins to discharge. This is caused by the capacitor's
own leakage resistance and the very small amount of current that still flows through the transistor.
Hence the data is read correctly only if it is read before the charge on the capacitor drops below some
threshold value. During a Read operation, the bit line is placed in a high-impedance state, the
transistor is turned on, and a sense circuit connected to the bit line is used to determine whether the
charge on the capacitor is above or below the threshold value. During such a Read, the charge on
the capacitor is restored to its original value, and thus the cell is refreshed with every read operation.
3. Define memory latency and bandwidth in case of burst operation that is used for
transferring a block of data to or from synchronous DRAM memory unit (8M) June 2015
Bipolar as well as MOS memory cells using a flip-flop like structure to store information
can maintain the information as long as current flow to the cell is maintained. Such memories are
called static memories. In contrast, dynamic memories require not only the maintaining of a
power supply, but also a periodic refresh to maintain the information stored in them. Dynamic
memories can have very high bit densities and much lower power consumption relative to static
memories, and are thus generally used to realize the main memory unit.
5. With figure explain about direct mapping cache memory. Jan 2014
Introduction
Computer pioneers correctly predicted that programmers would want unlimited amounts of fast
memory. An economical solution to that desire is a memory hierarchy, which takes advantage of
locality and trade-offs in the cost-performance of memory technologies. The principle of locality,
presented in the first chapter, says that most programs do not access all code or data uniformly.
Locality occurs in time (temporal locality) and in space (spatial locality). This principle, plus the
guideline that for a given implementation technology and power budget smaller hardware can be
made faster, led to hierarchies based on memories of different speeds and sizes. Figure 2.1 shows
a multilevel memory hierarchy, including typical sizes and speeds of access.
Since fast memory is expensive, a memory hierarchy is organized into several levels - each smaller,
faster, and more expensive per byte than the next lower level, which is farther from the processor.
The goal is to provide a memory system with cost per byte almost as low as the cheapest level of
memory and speed almost as fast as the fastest level. In most cases (but not all), the data contained
in a lower level are a superset of the next higher level. This property, called the inclusion property,
is always required for the lowest level of the hierarchy, which consists of main memory in the case
of caches and disk memory in the case of virtual memory.
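The question asks about direct mapping specifically. As a minimal illustration of how a direct-mapped cache decomposes an address into tag, index (cache line number), and block-offset fields, consider the following sketch; the block size and line count are assumed parameters, not values from the text:

```python
# Direct-mapped cache address decomposition (assumed geometry:
# 16-byte blocks, 128 cache lines).
BLOCK_BITS = 4    # log2(block size)
INDEX_BITS = 7    # log2(number of lines)

def decompose(addr: int) -> tuple[int, int, int]:
    offset = addr & ((1 << BLOCK_BITS) - 1)            # byte in block
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)  # line no.
    tag = addr >> (BLOCK_BITS + INDEX_BITS)            # stored tag
    return tag, index, offset

print(decompose(0x12345))  # (36, 52, 5)
```

In a direct-mapped cache each memory block can reside in exactly one line, the one given by the index field; a hit occurs when the tag stored in that line matches the tag field of the address.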
MODULE-VI: ARITHMETIC
1. Show the organization of virtual memory address translation based in fixed-length pages
and explains its working. (10M) June 2014, July 2015
Ans. Memory management by paging:- Fig 3 shows a simplified mechanism for virtual address
translation in a paged MMU. The process begins in a manner similar to the segmentation process.
The virtual address composed of a high order page number and a low order word number is applied
to MMU. The virtual page number is limit checked to be certain that the page is within the page
table, and if it is, it is added to the page table base to yield the page table entry. The page table
entry contains several control fields in addition to the page field. The control fields may include
access-control bits, a presence bit, a dirty bit, and one or more use bits; typically, the access-control
field will include bits specifying read, write, and perhaps execute permission. The presence bit
indicates whether the page is currently in main memory. The use bit is set upon a read or write to
the specified page, as an indication to the replacement algorithm in case a page must be replaced.
If the presence bit indicates a hit, then the page field of the page table entry will contain
the physical page number. If the presence bit indicates a miss, that is, a page fault, then the page
field of the page table entry contains an address in secondary memory where the page is stored. This
miss condition also generates an interrupt. The interrupt service routine will initiate the page fetch
from secondary memory and will also suspend the requesting process until the page has been
brought into main memory. If the CPU operation is a write hit, then the dirty bit is set. If the CPU
operation is a write miss, then the MMU will begin a write-allocate process.
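The hit/fault path of the translation above can be sketched as follows. This is a simplified model for illustration only; the page size, table structure, and exception choices are assumptions, and the access-control, dirty, and use bits are omitted:

```python
# Simplified paged-MMU address translation.
PAGE_BITS = 12  # assumed 4 KB pages

# page_table[virtual_page_number] = (present_bit, physical_page_number)
page_table = {0: (True, 7), 1: (False, None), 2: (True, 3)}

def translate(vaddr: int) -> int:
    vpn = vaddr >> PAGE_BITS                  # high-order page number
    offset = vaddr & ((1 << PAGE_BITS) - 1)   # low-order word number
    if vpn not in page_table:                 # limit check
        raise IndexError("page number exceeds page-table limit")
    present, ppn = page_table[vpn]
    if not present:                           # page fault: interrupt,
        raise LookupError("page fault")       # fetch from disk
    return (ppn << PAGE_BITS) | offset

print(hex(translate(0x2ABC)))  # virtual page 2 -> physical page 3
```

A real MMU caches recent entries in a TLB so that the table walk shown here is only taken on a TLB miss.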
2. Explain the design of a 4-bit carry-look ahead adder. (8M) Dec 2015, June 2016
Ans. For stage i of an adder, the carry-out can be written as ci+1 = Gi + Pici, where Gi = xiyi
and Pi = xi + yi.
The expressions Gi and Pi are called carry generate and propagate functions for stage
i. If the generate function for stage i is equal to 1, then ci+1 = 1, independent of the input carry, ci.
This occurs when both xi and yi are 1. The propagate function means that an input carry will
produce an output carry when either xi or yi or both equal to 1. Now, using Gi & Pi functions we
can decide carry for ith stage even before its previous stages have completed their addition
operations. All Gi and Pi functions can be formed independently and in parallel in only one gate
delay after the Xi and Yi inputs are applied to an n-bit adder. Each bit stage contains an AND gate
to form Gi, an OR gate to form Pi, and a three-input XOR gate to form si. However, a much simpler
circuit can be derived by considering the propagate function as Pi = xi XOR yi, which differs from Pi
= xi + yi only when xi = yi = 1, where Gi = 1 (so it does not matter whether Pi is 0 or 1). Then, the
basic diagram in Figure-5 can be used in each bit stage to predict carry ahead of any stage
completing its addition.
Further, ci = Gi-1 + Pi-1ci-1, and so on. Expanding in this fashion, the final carry expression
can be written as below:
ci+1 = Gi + PiGi-1 + PiPi-1Gi-2 + ... + PiPi-1...P1G0 + PiPi-1...P0c0
Thus, all carries can be obtained in three gate delays after the input signals Xi, Yi and Cin
are applied at the inputs. This is because only one gate delay is needed to develop all Pi and Gi
signals, followed by two gate delays in the AND-OR circuit (SOP expression) for ci + 1. After a
further XOR gate delay, all sum bits are available.
Therefore, independent of n, the number of stages, all carries are available in three gate
delays and all sum bits in four. As is clear from the previous discussion, a ripple-carry adder is
considerably slow, and a fast adder circuit must speed up the generation of the carry signals: it is
necessary to make the carry input to each stage readily available along with the input bits. This
can be achieved either by propagating the previous carry or by generating a carry depending on
the input bits and the previous carry. The logic expressions for si (sum) and ci+1 (carry-out) of
stage i are
si = xi XOR yi XOR ci
ci+1 = xiyi + xici + yici
Now, consider the design of a 4-bit parallel adder. The carries can be implemented as
c1 = G0 + P0c0 ; i = 0
c2 = G1 + P1G0 + P1P0c0 ; i = 1
c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0 ; i = 2
c4 = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0 ; i = 3
The complete 4-bit adder is shown in Figure 5b where the B cell indicates Gi, Pi & Si generator.
The carries are implemented in the block labeled carry look-ahead logic. An adder implemented
in this form is called a carry look ahead adder. Delay through the adder is 3 gate delays for all
carry bits and 4 gate delays for all sum bits. In comparison, note that a 4-bit ripple-carry adder
requires 7 gate delays (2n-1) for s3 and 8 gate delays (2n) for c4.
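The generate/propagate scheme above can be checked bit by bit in software. The sketch below computes Gi, Pi, and the carries with the same expressions as the text; the sequential loop is only a convenience, since in hardware all four carries are produced in parallel by the lookahead logic:

```python
# Bit-level model of a 4-bit carry-lookahead adder.
def cla4(x: int, y: int, c0: int = 0) -> tuple[int, int]:
    g = [(x >> i & 1) & (y >> i & 1) for i in range(4)]   # Gi = xi.yi
    p = [(x >> i & 1) | (y >> i & 1) for i in range(4)]   # Pi = xi+yi
    c = [c0]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))   # ci+1 = Gi + Pi.ci
    s = [(x >> i & 1) ^ (y >> i & 1) ^ c[i] for i in range(4)]
    total = sum(bit << i for i, bit in enumerate(s))
    return total, c[4]                   # 4-bit sum, carry-out c4

print(cla4(0b1011, 0b0110))  # 11 + 6 = 17 -> (1, 1): sum 0001, c4 = 1
```

The returned pair corresponds to s3..s0 and c4 in the figure.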
3. Answer the following w.r.t. to magnetic disk,the secondary storage device: (6M)
i) seek time
1. Ans. Seek time:- Is the average time required to move the read/write head to the desired
track. Actual seek times depend on where the head is when the request is received
and how far it has to travel, but since there is no way to know what these values will be
when an access request is made, the average figure is used. Average seek time must be
determined by measurement. It will depend on the physical size of the drive components
and how fast the heads can be accelerated and decelerated. Seek times are generally in the
range of 8-20 ms and have not changed much in recent years.
2. Access time:- Is equal to seek time plus rotational latency.
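As a worked example of the definition above (the seek time and rotation speed used here are assumed figures for illustration), the average rotational latency is half a revolution:

```python
# Access time = seek time + rotational latency,
# where average latency is half a rotation.
def avg_access_time_ms(seek_ms: float, rpm: float) -> float:
    latency_ms = 0.5 * 60_000 / rpm   # half a revolution, in ms
    return seek_ms + latency_ms

# Assumed drive: 9 ms average seek, 7200 rpm.
print(avg_access_time_ms(seek_ms=9.0, rpm=7200))  # 9 + 4.17 = 13.17 ms
```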
4. With figure explain circuit arrangements for binary division . Jan 2014
A circuit that implements division by this longhand method operates as follows: It
positions the divisor appropriately with respect to the dividend and performs a subtraction.
If the remainder is zero or positive, a quotient bit of 1 is determined, the remainder is extended
by another bit of the dividend, the divisor is repositioned, and subtraction is performed.
On the other hand, if the remainder is negative, a quotient bit of 0 is determined,
the dividend is restored by adding back the divisor, and the divisor is repositioned for
another subtraction.
arithmetic formats: sets of binary and decimal floating-point data, which consist of finite
numbers (including signed zeros and subnormal numbers), infinities, and special "not a
number" values (NaNs)
interchange formats: encodings (bit strings) that may be used to exchange floating-point data
in an efficient and compact form
rounding rules: properties to be satisfied when rounding numbers during arithmetic and
conversions
operations: arithmetic and other operations on arithmetic formats
exception handling: indications of exceptional conditions (such as division by zero,
overflow, etc.)
The standard also includes extensive recommendations for advanced exception handling,
additional operations (such as trigonometric functions), expression evaluation, and for achieving
reproducible results.
The standard is derived from and replaces IEEE 754-1985, the previous version, following a
seven-year revision process, chaired by Dan Zuras and edited by Mike Cowlishaw. The binary
formats in the original standard are included in the new standard along with three new basic
formats (one binary and two decimal). To conform to the current standard, an implementation
must implement at least one of the basic formats as both an arithmetic format and an interchange
format.
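The interchange formats mentioned above fix the bit layout of each value. For the binary64 basic format this is 1 sign bit, 11 exponent bits (biased by 1023), and 52 fraction bits; the short sketch below extracts these fields from a Python float, which is stored in binary64:

```python
import struct

# Decompose a binary64 value into its sign, exponent, and fraction
# fields as defined by the interchange format.
def fields(x: float) -> tuple[int, int, int]:
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # biased by 1023
    fraction = bits & ((1 << 52) - 1)      # significand minus hidden 1
    return sign, exponent, fraction

print(fields(-1.0))   # (1, 1023, 0): sign 1, biased exponent 1023
```

An all-ones exponent field encodes infinities (fraction 0) and NaNs (fraction nonzero); an all-zeros exponent field encodes signed zeros and subnormal numbers.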
Booth Algorithm
The Booth algorithm generates a 2n-bit product and both positive and negative 2's-
complement n-bit operands are uniformly treated. To understand this algorithm, consider a
multiplication operation in which the multiplier is positive and has a single block of 1s, for
example, 0011110 (+30). To derive the product, as in the normal standard procedure, we could add
four appropriately shifted versions of the multiplicand. However, using the Booth algorithm, we
can reduce the number of required operations by regarding this multiplier as the difference between
the numbers 32 and 2, as shown below:
0 1 0 0 0 0 0 (32)
0 0 0 0 0 1 0 (-2)
This suggests that the product can be generated by adding 2^5 times the multiplicand to the
2's-complement of 2^1 times the multiplicand. For convenience, we can describe the sequence of
required operations by recoding the preceding multiplier as 0 +1 0 0 0 -1 0. In general, in the Booth
scheme, -1 times the shifted multiplicand is selected when moving from 0 to 1, and +1 times the
shifted multiplicand is selected when moving from 1 to 0, as the multiplier is scanned from right
to left.
[Figure 10: normal multiplication versus Booth multiplication of the multiplicand 0101101 by
the multiplier 0011110; the partial-product rows are not reproduced here.]
Figure 10 illustrates the normal and the Booth algorithms for the said example. The Booth
algorithm clearly extends to any number of blocks of 1s in a multiplier, including the situation in
which a single 1 is considered a block. See Figure 11a for another example of recoding a multiplier.
The case when the least significant bit of the multiplier is 1 is handled by assuming that an implied
0 lies to its right. The Booth algorithm can also be used directly for negative multipliers, as shown
in Figure 11a.
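The right-to-left recoding rule just described can be sketched in Python. This is an illustrative helper, not part of the textbook; the function name and bit-string interface are assumptions.

```python
def booth_recode(bits):
    """Recode a multiplier bit string into Booth digits (+1, -1, 0).

    Scanning from right to left with an implied 0 to the right of the
    LSB: moving from 0 to 1 selects -1 times the shifted multiplicand,
    moving from 1 to 0 selects +1 times it.
    """
    digits = []
    prev = 0                       # the implied 0 to the right of the LSB
    for b in reversed(bits):       # scan the multiplier right to left
        cur = int(b)
        digits.append(prev - cur)  # 0 -> 1 gives -1; 1 -> 0 gives +1
        prev = cur
    return list(reversed(digits))

# The multiplier 0011110 (+30) recodes to 0 +1 0 0 0 -1 0:
print(booth_recode("0011110"))  # [0, 1, 0, 0, 0, -1, 0]
```

Summing digit × 2^position over the recoded digits reproduces the original multiplier value (2^5 - 2^1 = 30), which is why the recoded form yields the same product.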
To verify the correctness of the Booth algorithm for negative multipliers, we use the following
property of negative-number representations in the 2's-complement
1. Do the following n times: If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A. Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.
2. If the sign of A is 1, add M to A.
Step 2 is needed to leave the proper positive remainder in A at the end of the n cycles of
Step 1. The logic circuitry in Figure 17 can also be used to perform this algorithm. Note that the
Restore operations are no longer needed, and that exactly one Add or Subtract operation is
performed per cycle. Figure 19 shows how the division example in Figure 18 is executed by the
nonrestoring-division algorithm.
There are no simple algorithms for directly performing division on signed operands that are
comparable to the algorithms for signed multiplication. In division, the operands can be
preprocessed to transform them into positive values. After using one of the algorithms just
discussed, the results are transformed to the correct signed values, as necessary.
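The nonrestoring scheme described above can be sketched at the register level in Python, with A, Q, and M modeled as plain integers. This is an illustrative sketch under the assumption of positive n-bit operands, not the textbook's circuit; the function name is an assumption.

```python
def nonrestoring_divide(dividend, divisor, n):
    """Nonrestoring division of an n-bit positive dividend by a
    positive divisor. Returns (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # Shift A and Q left one bit position (Q's MSB moves into A).
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & ((1 << n) - 1)
        if a >= 0:       # sign of A is 0: subtract M
            a -= m
        else:            # sign of A is 1: add M
            a += m
        if a >= 0:       # set q0 according to the new sign of A
            q |= 1
    if a < 0:            # Step 2: restore the proper positive remainder
        a += m
    return q, a

print(nonrestoring_divide(8, 3, 4))  # (2, 2): 8 / 3 = 2 remainder 2
```

Note that exactly one add or subtract is performed per cycle, with a single conditional add at the end, matching the text's observation that the Restore operations are no longer needed.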
3. Write the algorithm for binary division using the restoring division method. (8M) Dec 2015, July 2016
Figure 17 shows a logic circuit arrangement that implements restoring division. Note its similarity
to the structure for multiplication that was shown in Figure 8. An n-bit positive divisor is loaded
into register M and an n-bit positive dividend is loaded into register Q at the start of the operation.
Register A is set to 0. After the division is complete, the n-bit quotient is in register Q and the
remainder is in register A. The required subtractions are facilitated by using 2's-complement
arithmetic. The extra bit position at the left end of both A and M accommodates the sign bit during
subtractions. The following algorithm performs restoring division. Do the following n times: shift A and Q left one binary position; subtract M from A, placing the answer back in A; if the sign of A is 1, set q0 to 0 and add M back to A (restoring it); otherwise, set q0 to 1.
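A register-level sketch of restoring division in Python, with A, Q, and M modeled as integers. This is illustrative only; the function name and integer modeling are assumptions, not the text's circuit.

```python
def restoring_divide(dividend, divisor, n):
    """Restoring division of an n-bit positive dividend by a positive
    divisor: shift, subtract, and restore on a negative result."""
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # Shift A and Q left one bit position (Q's MSB moves into A).
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & ((1 << n) - 1)
        a -= m           # tentatively subtract the divisor
        if a < 0:
            a += m       # restore A: the subtraction failed, q0 = 0
        else:
            q |= 1       # the subtraction succeeded, set q0 = 1
    return q, a          # quotient in Q, remainder in A

print(restoring_divide(8, 3, 4))  # (2, 2): 8 / 3 = 2 remainder 2
```

Compared with the nonrestoring version, each cycle here may perform two operations (subtract, then restoring add), which is exactly the inefficiency nonrestoring division removes.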
4. List the rules for addition, subtraction, multiplication and division of floating-point numbers. (8M) June 2014/July 2015
Floating-point arithmetic is an automatic way to keep track of the radix point. The discussion so far has dealt exclusively with fixed-point numbers, which are considered as integers, that is, as having an implied binary point at the right end of the number. It is also possible to assume that the binary point is just to the right of the sign bit, thus representing a fraction, or anywhere else, resulting in real numbers. In the 2's-complement system, the signed value F, represented by
the n-bit binary fraction
F(B) = -b0 × 2^0 + b-1 × 2^-1 + b-2 × 2^-2 + ... + b-(n-1) × 2^-(n-1)
where the range of F is -1 ≤ F ≤ 1 - 2^-(n-1).
Consider the range of values representable in a 32-bit, signed, fixed-point format.
Interpreted as integers, the value range is approximately 0 to 2.15 × 10^9. If we consider them to
be fractions, the range is approximately 4.55 × 10^-10 to 1. Neither of these ranges is sufficient
for scientific calculations, which might involve parameters like Avogadro's number (6.0247 × 10^23
mole^-1) or Planck's constant (6.6254 × 10^-27 erg·s). Hence, we need to easily accommodate both
very large integers and very small fractions. To do this, a computer must be able to represent
numbers and operate on them in such a way that the position of the binary point is variable and is
automatically adjusted as computation proceeds. In such a case, the binary point is said to float,
and the numbers are called floating-point numbers. This distinguishes them from fixed-point
numbers, whose binary point is always in the same position.
Because the position of the binary point in a floating-point number is variable, it must be
given explicitly in the floating-point representation. For example, in the familiar decimal scientific
notation, numbers may be written as 6.0247 × 10^23, 6.6254 × 10^-27, -1.0341 × 10^2, -7.3000 × 10^-14,
and so on. These numbers are said to be given to five significant digits. The scale factors (10^23,
10^-27, and so on) indicate the position of the decimal point with respect to the significant digits. By
convention, when the decimal point is placed to the right of the first (nonzero) significant digit,
the number is said to be normalized. Note that the base, 10, in the scale factor is fixed and does
not need to appear explicitly in the machine representation of a floating-point number. The sign,
the significant digits, and the exponent in the scale factor constitute the representation. We are thus
motivated to define a floating-point number representation as one in which a number is represented
by its sign, a string of significant digits, commonly called the mantissa, and an exponent to an
implied base for the scale factor.
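This sign/mantissa/exponent decomposition can be inspected directly for IEEE 754 doubles using Python's standard struct module. The sketch below is illustrative; the helper name and the choice of the 64-bit format are assumptions.

```python
import struct

def decompose(x):
    """Split a 64-bit IEEE 754 double into its sign bit, biased
    exponent field, and fraction (mantissa) field."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52-bit mantissa field
    return sign, exponent, fraction

# -1.5 = (-1)^1 x 1.5 x 2^0: sign 1, unbiased exponent 0, fraction 0.5
sign, exponent, fraction = decompose(-1.5)
print(sign, exponent - 1023, fraction / 2**52)  # 1 0 0.5
```

The bias of 1023 plays the role of the implied base-2 scale factor's exponent offset, so the base never appears explicitly in the stored representation, just as the text describes.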
5.Draw and explain the single-bus organization of the data path inside a processor.
Jan 2014
Ans: A datapath is a collection of functional units, such as arithmetic logic units or multipliers,
that perform data processing operations. It is a central part of many central processing
units (CPUs) along with the control unit, which largely regulates interaction between the
datapath and the data itself, usually stored in registers or main memory.
Recently, there has been growing research in the area of reconfigurable datapaths (datapaths that may be re-purposed at run-time using programmable fabric), as such designs may allow for more efficient processing as well as substantial power savings.
Examples
Let us consider, in detail, addition as an arithmetic operation and retrieving data from memory.
Example 1) Arithmetic addition: the contents of registers reg1 and reg2 are added and the result is stored in reg3.
Sequence of operations:
1. reg1out, Xin
2. reg2out, choose X, ADDITION, Yin
3. Yout, reg3in
The control signals written on one line are executed in the same clock cycle; all other signals remain untouched. In the first step, the contents of reg1 are written into register X through the bus. In the second step, the contents of reg2 are placed onto the bus, and the multiplexer is made to choose input X, since the contents of reg1 are stored in register X. The ALU then adds the contents of register X and reg2 and stores the result of the addition in the special temporary register Y. In the final step, the result stored in Y is sent to reg3 over the internal processor bus. Note that only one register can output its data onto the bus in any one step.
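The three control steps can be mimicked with a toy Python model of the bus and registers. This model is purely illustrative (a dictionary standing in for hardware); the register values are arbitrary assumptions.

```python
# Toy model of the single-bus add: reg3 <- reg1 + reg2.
regs = {"reg1": 7, "reg2": 5, "reg3": 0, "X": 0, "Y": 0}

# Step 1: reg1out, Xin -- reg1 drives the bus, register X latches it.
bus = regs["reg1"]
regs["X"] = bus

# Step 2: reg2out, choose X, ADDITION, Yin -- reg2 drives the bus, the
# multiplexer feeds X into the ALU's second input, Y latches the sum.
bus = regs["reg2"]
regs["Y"] = regs["X"] + bus

# Step 3: Yout, reg3in -- Y drives the bus, reg3 latches the result.
bus = regs["Y"]
regs["reg3"] = bus

print(regs["reg3"])  # 12
```

The single `bus` variable makes the one-source-per-step constraint explicit: each step assigns the bus exactly once before any register latches it.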
Ans: A branch is an instruction in a computer program that may, when executed by a computer,
cause the computer to begin execution of a different instruction
sequence. Branch (or branching, branched) may also refer to the act of beginning execution of a
different instruction sequence due to executing a branch instruction. A branch instruction can be
either an unconditional branch, which always results in branching, or a conditional branch,
which may or may not cause branching depending on some condition.
When executing (or "running") a program, a computer will fetch and execute instructions in
sequence (in their order of appearance in the program) until it encounters a branch instruction. If
the instruction is an unconditional branch, or it is conditional and the condition is satisfied, the
computer will branch (fetch its next instruction from a different instruction sequence) as specified
by the branch instruction. However, if the branch instruction is conditional and the condition is
not satisfied, the computer will not branch; instead, it will continue executing the current
instruction sequence, beginning with the instruction that follows the conditional branch
instruction.
7. Write and explain the control sequences for the execution of an unconditional branch
instruction (10M) June 2015
The offset X used in a branch instruction is usually the difference between the branch target address
and the address immediately following the branch instruction.
For example, if the branch instruction is at location 2000 and if the branch target address
is 2050, the value of X must be 46. The reason for this can be readily appreciated from the control
sequence in Figure 7. The PC is incremented during the fetch phase, before knowing the type of
instruction being executed. Thus, when the branch address is computed in step 4, the PC value
used is the updated value, which points to the instruction following the branch instruction in the
memory.
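Assuming 4-byte instructions (consistent with 2050 - 2004 = 46 in the example), the offset arithmetic can be checked with a small Python helper. The helper name and default instruction size are assumptions for illustration.

```python
def branch_offset(branch_addr, target_addr, instr_size=4):
    """Offset X = branch target minus the already-updated PC, which
    points at the instruction following the branch."""
    updated_pc = branch_addr + instr_size   # PC incremented during fetch
    return target_addr - updated_pc

# Branch at location 2000, target 2050: X must be 46.
print(branch_offset(2000, 2050))  # 46
```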
Consider now a conditional branch. In this case, we need to check the status of the condition
codes before loading a new value into the PC. For example, for a Branch-on-negative (Branch<0)
instruction, step 4 is replaced with
8. Explain with block diagram the basic organization of a micro programmed control
unit(10M) Dec 2015
The control signals required inside the processor can be generated using a control step
counter and a decoder/encoder circuit. Now we discuss an alternative scheme, called micro
programmed control, in which control signals are generated by a program similar to machine
language programs.
First, we introduce some common terms. A control word (CW) is a word whose individual
bits represent the various control signals in Figure 12. Each of the control steps in the control
sequence of an instruction defines a unique combination of 1s and 0s in the CW. The CWs
corresponding to the 7 steps of Figure 6 are shown in Figure 15. We have assumed that SelectY
is represented by Select = 0 and Select4 by Select = 1. A sequence of CWs corresponding to the
control sequence of a machine instruction constitutes the micro routine for that instruction, and the
individual control words in this micro routine are referred to as microinstructions.
The micro routines for all instructions in the instruction set of a computer are stored in a
special memory called the control store. The control unit can generate the control signals for any
instruction by sequentially reading the CWs of the corresponding micro routine from the control
store. This suggests organizing the control unit as shown in Figure 16. To read the control words
sequentially from the control store, a microprogram counter (μPC) is used. Every time a new
instruction is loaded into the IR, the output of the block labeled "starting address generator" is
loaded into the μPC. The μPC is then automatically incremented by the clock, causing successive
microinstructions to be read from the control store. Hence, the control signals are delivered to
various parts of the processor in the correct sequence.
One important function of the control unit cannot be implemented by the simple organization
in Figure 16. This is the situation that arises when the control unit is required to check the status
of the condition codes or external inputs to choose between alternative courses of action. In the
case of hardwired control, this situation is handled by including an appropriate logic function in
the encoder circuitry. In micro programmed control, an alternative approach is to use conditional
branch microinstructions. In addition to the branch address, these microinstructions specify which
of the external inputs, condition codes, or, possibly, bits of the instruction register should be
checked as a condition for branching to take place.
Ans. However, this scheme has one serious drawback: assigning individual bits to each
control signal results in long microinstructions, because the number of required signals is usually
large. Moreover, only a few bits are set to 1 (to be used for active gating) in any given
microinstruction, which means the available bit space is poorly used. Consider again the simple
processor of Figure 2, and assume that it contains only four general-purpose registers, R0, R1, R2,
and R3. Some of the connections in this processor are permanently enabled, such as the output of
the IR to the decoding circuits and both inputs to the ALU. The remaining connections to various
registers require a total of 20 gating signals. Additional control signals not shown in the figure are
also needed, including the Read, Write, Select, WMFC, and End signals. Finally, we must specify
the function to be performed by the ALU. Let us assume that 16 functions are provided, including
Add, Subtract, AND, and XOR. These functions depend on the particular ALU used and do not
necessarily have a one-to-one correspondence with the machine instruction OP codes. In total, 42
control signals are needed.
If we use the simple encoding scheme described earlier, 42 bits would be needed in each
microinstruction. Fortunately, the length of the microinstructions can be reduced easily. Most
signals are not needed simultaneously, and many signals are mutually exclusive. For example,
only one function of the ALU can be activated at a time. The source for a data transfer must be
unique because it is not possible to gate the contents of two different registers onto the bus at the
same time. Read and Write signals to the memory cannot be active simultaneously. This suggests
that signals can be grouped so that all mutually exclusive signals are placed in the same group.
Thus, at most one micro operation per group is specified in any microinstruction. Then it is
possible to use a binary coding scheme to represent the signals within a group. For example, four
bits suffice to represent the 16 available functions in the ALU. Register output control signals can
be placed in a group consisting of PCout, MDRout, Zout, Offsetout, R0out, R1out, R2out, R3out, and
TEMPout. Any one of these can be selected by a unique 4-bit code.
Further natural groupings can be made for the remaining signals. Figure 19 shows an
example of a partial format for the microinstructions, in which each group occupies a field large
enough to contain the required codes. Most fields must include one inactive code for the case in
which no action is required. For example, the all-zero pattern in F1 indicates that none of the
registers that may be specified in this field should have its contents placed on the bus. An inactive
code is not needed in all fields. For example, F4 contains 4 bits that specify one of the 16 operations
performed in the ALU. Since no spare code is included, the ALU is active during the execution of
every microinstruction. However, its activity is monitored by the rest of the machine through
register Z, which is loaded only when the Zin signal is activated.
Grouping control signals into fields requires a little more hardware because decoding
circuits must be used to decode the bit patterns of each field into individual control signals. The
cost of this additional hardware is more than offset by the reduced number of bits in each
microinstruction, which results in a smaller control store. In Figure 19, only 20 bits are needed to
store the patterns for the 42 signals.
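The bit counts quoted above follow from binary encoding: a field for k mutually exclusive signals (plus one inactive code where required) needs ceil(log2(k + 1)) bits. A quick Python check, using the two groups named in the text (the helper name is an assumption):

```python
from math import ceil, log2

def field_bits(k, spare=True):
    """Bits for a binary-encoded field of k mutually exclusive
    signals, optionally reserving one extra inactive (all-zero) code."""
    return ceil(log2(k + (1 if spare else 0)))

# The 16 ALU functions need 4 bits with no spare code reserved.
print(field_bits(16, spare=False))  # 4

# The 9 register-output signals plus an inactive code also fit in 4 bits.
print(field_bits(9))                # 4
```

Summing such field widths over all groups is what brings the 42 individual signals down to roughly 20 encoded bits per microinstruction.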
So far we have considered grouping and encoding only mutually exclusive control signals.
We can extend this idea by enumerating the patterns of required signals in all possible
microinstructions. Each meaningful combination of active control signals can then be assigned a distinct code.
10. List out differences between shared memory multiprocessor and cluster. (July 2015, Jan 2014)
Ans: In a multiple processor computer, an important issue is: How do processors coordinate to
solve a problem? Processors must have the ability to communicate with each other in order to
cooperatively complete a task. There are two general approaches to address this problem.
One option uses a single address space. Systems based on this concept, otherwise known as
shared-memory systems, allow processor communication through variables stored in a shared
address space.
The other alternative employs a scheme by which each processor has its own memory module.
Such a distributed-memory system (cluster) is constructed by connecting each component with a
high-speed communications network. Processors communicate to each other over the network.
Parallel computing is a form of computation in which many calculations are carried out
simultaneously, operating on the principle that large problems can often be divided into smaller
ones, which are then solved concurrently ("in parallel"). There are several different forms of
parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has been
employed for many years, mainly in high-performance computing, but interest in it has grown
lately due to the physical constraints preventing frequency scaling. As power consumption (and
consequently heat generation) by computers has become a concern in recent years, parallel
computing has become the dominant paradigm in computer architecture, mainly in the form
of multicore processors.
Parallel computers can be roughly classified according to the level at which the hardware
supports parallelism, with multi-core and multi-processor computers having multiple processing
elements within a single machine, while clusters, MPPs, and grids use multiple computers to work
on the same task. Specialized parallel computer architectures are sometimes used alongside
traditional processors, for accelerating specific tasks.
Parallel computer programs are more difficult to write than sequential ones, because
concurrency introduces several new classes of potential software bugs, of which race conditions are
the most common. Communication and synchronization between the different subtasks are
typically some of the greatest obstacles to getting good parallel program performance.
Assignment Questions
MODULE-I
Basic Structure of Computers
MODULE-II
Input/output Organization
2. With a neat block diagram, explain any two methods of handling multiple I/O devices.
3. Define exceptions. Explain kinds of exceptions.*
4. What is the necessity of DMA controller? Explain the methods of bus arbitration.
5. Show the possible register configurations in a DMA interface; explain direct memory access (DMA).*
6. What is the necessity of bus arbitration? Explain the different methods of bus arbitration.
7. Compare programmed I/O, interrupt-driven I/O, and DMA-based I/O.
8. Compare serial and parallel interfaces for efficiency and complexity with example
9. What are the features of SCSI bus? Write a note on arbitration and selection on SCSI bus
10. Describe the split bus operation. How can it be connected to two fast devices and one
slow device?
MODULE-III
Memory System
1. How do read and write operations take place in a 1K × 1 memory chip? Explain.
2. With a block diagram, explain the operation of a 16-megabit DRAM chip configured as 2M × 8.
3. Mention any two differences between static and dynamic RAMs. Explain the internal organization of a memory chip consisting of 16 words of 8 bits each.*
4. What are the various factors to be considered in the choice of a memory chip? Explain.
5. Give the organization of a 2M × 32 memory module using 512K × 8 static memory chips.
6. Discuss the different types of RAMs, bringing out their salient features.
MODULE-IV
Arithmetic
(ii) Show how to implement a full adder using two half-adders and an external OR gate.
5. Explain how a 16-bit carry look ahead adder can be built from 4-bit adders.
6. How do you design fast adders? Explain a 4-bit carry look ahead adder.
MODULE-V
Basic Processing Unit
2. Write and explain control sequences for execution of the instruction SUB R1, (R4).
PART A
1 a. What is performance measurement? Explain the overall SPEC rating for the computer in a
program suite. (04 Marks)
b. List and explain the technological features and devices improvement made during different
generations of computers.
(08 Marks)
c. Mention four types of operations to be performed by instructions in a computer. Explain
with basic types of instruction formats to carry out C ← [A] + [B]. (08 Marks)
2 a. Define an addressing mode. Explain the following addressing modes, with example:
immediate, indirect, index, relative and auto increment. (10 Marks)
b. What is a stack frame? Explain a commonly used layout for information in a subroutine
stack frame.
c. Explain shift and rotate operations, with example. (10 Marks)
4 a. Explain with a neat block diagram, the hardware components needed for connecting a
keyboard to a processor. (08 Marks)
b. Briefly discuss the main phases involved in the operation of SCSI bus. (06 Marks)
c. Explain the tree structure of USB with split bus operation. (06 Marks)
PART B
6 a. Explain with figure the design and working of a 16-bit carry look ahead adder built from
4-bit adders. (06 Marks)
b. Explain Booth algorithm. Apply Booth algorithm to multiply the signed numbers +13 and
6. (10 Marks)
c. Differentiate between restoring and non-restoring division. (04 Marks)
10CS46
7 a. List out the actions needed to execute the instruction add(R3), R1. Write and explain
sequence of control steps for the execution of the same. (10 Marks)
b. With a neat block diagram, explain the hardwired control unit. Show the generation of Zin
and End control signals. (10 Marks)
8 a. State Amdahl's law. Suppose a program runs in 100 sec on a computer, with the multiply
operation responsible for 80 sec of this time. By how much must the speed of multiplication
be improved if the program is to run 5 times faster? Justify your answer. (06 Marks)
b. Explain the classic organization of a shared memory multiprocessor. (06 Marks)
c. What is hardware multithreading? Explain the different approaches to hardware
multithreading. (08 Marks)
*****