COA Course File
Prepared by
Assistant Professor
Mrs Asmita P Ambekar
Mrs V Divya
Code: D0506
Academic Year 2025-26
Regulations: MR24
MALLA REDDY ENGINEERING COLLEGE
(Autonomous)
Maisammaguda, Dhulapally, Secunderabad - 500100
Department of Computer Science and Engineering-CSE
SECTION-A: CLASS DOCUMENTS
Institute Vision
To be a premier centre of professional education and research, offering quality programs in
a socio-economic and ethical ambience.
Institute Mission
1. To impart knowledge of advanced technologies using state-of-the-art
infrastructural facilities.
2. To inculcate innovation and best practices in education, training and research.
3. To meet changing socio-economic needs in an ethical ambience.
Department of Computer Science and Engineering
(CSE)
Institute Vision and Mission
Vision
Mission
M1- To impart quality education and research to undergraduate and postgraduate students in Computer
Science and Engineering.
M2- To encourage innovation and best practices in Computer Science and Engineering
utilizing state-of-the-art facilities.
M3- To develop entrepreneurial spirit and knowledge of emerging technologies based on ethical
values and social relevance.
PEOs
PEO1
To impart sound knowledge of the scientific and engineering technologies necessary to formulate,
analyze, design and implement solutions to computer technology related problems.
PEO2
To carry out research in frontier areas of computer science and engineering with the capacity to learn
independently throughout life to develop new technologies.
PEO3
To train students to exhibit technical, communication and project management skills in their profession
and to follow ethical practices.
PEO4
To possess leadership and team working skills to become a visionary and an inspirational leader and
entrepreneur.
PO’s
PO2 - Problem Analysis: Identify, formulate, review research literature and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics, natural
sciences, and engineering sciences.
PO3 - Design/Development of Solutions: Design solutions for complex engineering problems and
design system components or processes that meet the specified needs with appropriate consideration for
the public health and safety, and the cultural, societal, and environmental considerations.
PO4 - Conduct Investigations of Complex Problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis of the
information to provide valid conclusions.
PO5 - Modern Tool Usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
PO6 - The Engineer and Society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.
PO7 - Environment and Sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
PO8 - Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice.
PO9 - Individual and Teamwork: Function effectively as an individual and as a member or leader in
diverse teams, and in multidisciplinary settings.
PO11 - Project Management and Finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and leader in a
team, to manage projects and in multidisciplinary environments.
PO12 - Life-long Learning: Recognize the need for and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
PSO’s
PSO1 - Apply the knowledge gained during the course of the program from mathematics, basic
computing, basic sciences and all computer science courses in particular to identify, formulate and solve
real life complex engineering problems faced in industries and/or during research work with due
consideration for the public health and safety, in the context of cultural, societal, and environmental
situations.
PSO2 - Provide socially acceptable technical solutions to complex computer science engineering problems
with the application of modern and appropriate techniques for sustainable development relevant to
professional engineering practice.
PSO3 - Comprehend and write effective project reports in multidisciplinary environments in the context of
changing technologies.
Academic Calendar 2025-26
MALLA REDDY ENGINEERING COLLEGE (Autonomous)
(MR-24) Onwards, B.Tech. III Semester
Code: D0506    Computer Organization and Architecture (CSE)
Credits: 3     L T P: 3 - -
Objectives
1. The purpose of the course is to introduce the principles of digital fundamentals, computer organization
and the basic architectural concepts.
2. It begins with basic organization, design, and programming of a simple digital computer and
introduces simple register transfer language to specify various computer operations.
3. Topics include computer arithmetic, instruction set design, microprogrammed control unit,
pipelining and vector processing, memory organization and I/O systems, and multiprocessors.
MODULE-V [9 Periods]
Reduced Instruction Set Computer: CISC Characteristics, RISC Characteristics. Pipeline and Vector
Processing: Parallel Processing, Pipelining, Arithmetic Pipeline, Instruction Pipeline, RISC Pipeline,
Vector Processing, Array Processor. Multiprocessors: Characteristics of Multiprocessors, Interconnection
Structures, Interprocessor arbitration, Interprocessor communication and synchronization, Cache
Coherence.
Outcomes:
1. Understand the basics of instruction sets and their impact on processor design.
2. Demonstrate an understanding of the design of the functional units of a digital computer system.
3. Evaluate cost performance and design trade-offs in designing and constructing a computer processor
including memory.
4. Design a pipeline for consistent execution of instructions with minimum hazards.
5. Recognize and manipulate representations of numbers stored in digital computers.
Textbook:
1. Computer System Architecture, M. Morris Mano, 3rd Edition, Pearson/PHI.
References:
1. Computer Organization, Carl Hamacher, Zvonko Vranesic, Safwat Zaky, 5th Edition, McGraw Hill.
2. Computer Organization and Architecture, William Stallings, 6th Edition, Pearson/PHI.
3. Structured Computer Organization, Andrew S. Tanenbaum, 4th Edition, PHI/Pearson.
E-Resources:
1. https://books.google.co.in/books?isbn=8131700704
2. http://ndl.iitkgp.ac.in/document/yVCWqd6u7wgye1qwH9xY7Eh9eBOsT1ELoYpKlg_xngrkluevXOJLs1TbxS8q2icgUs3hL4_KAi5So5FgXcVg
3. http://ndl.iitkgp.ac.in/document/yVCWqd6u7wgye1qwH9xY7xAYUzYSlXl4zudlsolr-
e7wQNrNXLxbgGFxbkoyx1iN3YbHuFrzI2jc_70rWMEwQ
4. http://nptel.ac.in/courses/106106092/
CO-PO, PSO Mapping
(3/2/1 indicates strength of correlation) 3-Strong, 2-Medium, 1-Weak
Programme Outcomes (POs) PSOs
COs
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 2 2 2
CO2 2 3 1
CO3 2 2 3 2 2 2
CO4 3
CO5 3
Bloom’s Taxonomy Triangle
The Graduate Attributes for UG Engineering Student
2. Problem Analysis: Identify, formulate, research literature and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural
sciences and engineering sciences.
5. Modern Tool Usage: Select and apply appropriate techniques, resources, and modern
engineering and IT tools, including prediction and modeling, to broadly-defined engineering
activities, with an understanding of the limitations.
6. The Engineer and society: Demonstrate understanding of the societal, health, safety, legal
and cultural issues, and the consequent responsibilities relevant to engineering technology
practice.
8. Ethics: Understand and commit to professional ethics and responsibilities and norms of
engineering technology practice.
12. Life-long learning: Recognize the need for, and have the ability to engage in independent
and life-long learning in specialized technologies.
LESSON PLAN
Academic Year 2025-2026
Department: Computer Science and Engineering-CS
Course Title: Computer Organization and Architecture    Course Code: D0506
Year/Semester: II / I    Sections: CSE
Category: Core Course
Prerequisites Knowledge: DLD
Duration: One semester    Credit Units: 3
Class/Laboratory Schedule: 3/0/0 [LTP]
Curriculum gap: Matching POs and PSOs
    PO: PO1, PO2, PO3, PO6, PO10, PO12
    PSO: PSO1, PSO2
Course Objectives: This course introduces the principles of computer organization and the basic
architectural concepts. It begins with basic organization, design, and programming of a simple
digital computer and introduces simple register transfer language to specify various computer
operations. Topics include computer arithmetic, instruction set design, microprogrammed control
unit, pipelining and vector processing, memory organization and I/O systems, and multiprocessors.
Texts & References (*recommended text book(s)):
R1 - Computer Organization - Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Fifth Edition
R2 - Structured Computer Organization - Andrew S. Tanenbaum, Fourth Edition, PHI/Pearson.
Student Assessments: Assignments, Mid test I and II, Final examination
Outcome Assessment: Assignments and examinations, Course evaluation
S. No | Date | Period Reqd. | Title of the Topic | Actual Date | Ref. Book | Teaching Methodologies | Remarks
7 | | 1 | Arithmetic logic shift unit | | TB1 | Chalk, Black Board, PPT |
9 | | | Timing and Control, Instruction cycle | | TB1 | Chalk, Black Board, PPT |
MODULE-II: Micro Programmed Control (Theory Classes: 09)
11 | | 1 | Control memory | | TB1 | Chalk, Black Board, PPT |
12 | | 1 | Address sequencing | | TB1 | Chalk, Black Board, PPT |
13 | | 1 | Microprogram example, design of control unit | | TB1 | Chalk, Black Board, PPT |
14 | | 1 | Central Processing Unit: General Register Organization | | TB1 | Chalk, Black Board, PPT |
15 | | 1 | Instruction Formats | | TB1 | Chalk, Black Board, PPT |
16 | | 1 | Addressing modes | | TB1 | Chalk, Black Board, PPT |
17 | | 1 | Data Transfer and Manipulation | | TB1 | Chalk, Black Board, PPT |
18 | | 1 | Program Control | | TB1 | Chalk, Black Board, PPT |
MODULE-III: Data Representation (Theory Classes: 10)
19 | | 1 | Data types, Complements | | TB1 | Chalk, Black Board, PPT |
28 | | 1 | Asynchronous data transfer | | TB1 | Chalk, Black Board, PPT |
29 | | 1 | Modes of Transfer | | TB1 | Chalk, Black Board, PPT |
30 | | 1 | Priority Interrupt, Direct Memory Access | | TB1 | Chalk, Black Board, PPT |
31 | | 1 | Memory Organization: Memory Hierarchy | | TB1 | Chalk, Black Board, PPT |
32 | | | Main Memory | | TB1 | Chalk, Black Board, PPT |
33 | | | Auxiliary Memory | | TB1 | Chalk, Black Board, PPT |
34 | | | Associative Memory | | TB1 | Chalk, Black Board, PPT |
35 | | | Cache Memory | | TB1 | Chalk, Black Board, PPT |
MODULE-V: Reduced Instruction Set Computer (Theory Classes: 09)
36 | | 2 | CISC Characteristics | | TB1 | Chalk, Black Board, PPT |
37 | | 1 | RISC Characteristics | | TB1 | Chalk, Black Board, PPT |
38 | | 1 | Pipeline and Vector Processing: Parallel Processing | | TB1 | Chalk, Black Board, PPT |
39 | | 1 | Pipelining | | TB1 | Chalk, Black Board, PPT |
40 | | 1 | Arithmetic Pipeline | | TB1 | Chalk, Black Board, PPT |
41 | | 2 | Instruction Pipeline | | TB1 | Chalk, Black Board, PPT |
42 | | 1 | RISC Pipeline | | TB1 | Chalk, Black Board, PPT |
43 | | 1 | Vector Processing | | TB1 | Chalk, Black Board, PPT |
44 | | 1 | Array Processor | | TB1 | Chalk, Black Board, PPT |
45 | | 1 | Multiprocessors: Characteristics of Multiprocessors | | TB1 | Chalk, Black Board, PPT |
46 | | 1 | Interconnection Structures | | TB1 | Chalk, Black Board, PPT |
47 | | 1 | Interprocessor arbitration | | TB1 | Chalk, Black Board, PPT |
48 | | 1 | Interprocessor communication and synchronization | | TB1 | Chalk, Black Board, PPT |
49 | | 1 | Cache Coherence | | TB1 | Chalk, Black Board, PPT |
Total Classes: 49
TEXTBOOKS
T1 - Computer System Architecture - M. Morris Mano, Third Edition, Pearson/PHI
REFERENCES
R1 - Computer Organization - Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Fifth Edition
R2 - Structured Computer Organization - Andrew S. Tanenbaum, Fourth Edition, PHI/Pearson.
E-RESOURCES
1. http://ndl.iitkgp.ac.in/document/xttk-4kfhvUwVlXBW-WRPf64_TFk2i4LJhgQFPQWAEt-
Zobbm3twyubjRA1YOe9WVwkN2qGcxBwdHaPdi_mMQ
2. https://ndl.iitkgp.ac.in/result?q={"t":"search","k":"object%20oriented%20programming","s":["type=\"video\""],"b":{"filters":[]}}
3. http://www.rehancodes.com/files/oop-using-c++-by-joyce-farrell.pdf
4. http://www.nptel.ac.in/courses/106103115/36
FACULTY HOD
PEOs Mapping with POs
Program Outcomes (a-i)
Program Educational Objective a b c d e f g h i
PEO I ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
PEO II ✓ ✓ ✓ ✓ ✓ ✓
PEO III ✓ ✓ ✓ ✓
PEO IV ✓ ✓ ✓ ✓ ✓
Subject Mapping with PEOs
The components of the curriculum and their relevance to the POs and the PEOs
Programme curriculum grouping based on different components
Course Component | Curriculum Content (% of number of credits of the programme) | Total no. of contact hours | Total number of credits | POs | PEOs
II Year I Semester
Professional core | Computer Organization and Architecture | 48 | 3 | a,b,c,f,g,h | P1,P2
Subject Mapping with POs
Core engineering subjects and their relevance to Programme Outcomes including design
experience.
Contribution of courses to program outcomes
II YEAR I SEMESTER
Course No. & Title | Program Outcomes (a-i): a b c d e f g h i | Course outcomes
Computer Organization and Architecture | ✓ ✓ ✓ ✓ ✓ ✓ ✓ | Understand: Understand the basics of instruction sets and their impact on processor design.
Course Outcomes:
At the end of the course, students will be able to
Programme Outcomes (POs) PSOs
COs
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 2 3 3 3 2 2 3 3
CO2 3 3 1 3 3 2 3 3 3
CO3 3 3 3 3 3 3 3 3
CO4 2 1 3 3 2
CO5 2 3 1
Course Objectives and Course Outcomes
Course Objectives:
● The purpose of the course is to introduce principles of computer organization and the basic
architectural concepts.
● It begins with basic organization, design, and programming of a simple digital computer and
introduces simple register transfer language to specify various computer operations.
● Topics include computer arithmetic, instruction set design, microprogrammed control unit,
pipelining and vector processing, memory organization and I/O systems, and multiprocessors.
Course Outcomes:
At the end of the course, students will be able to
1. Understand: Understand the basics of instruction sets and their impact on processor design.
2. Evaluate: Demonstrate an understanding of the design of the functional units of a digital computer
system.
3. Apply: Evaluate cost performance and design trade-offs in designing and constructing a computer
processor including memory.
4. Analyze: Design a pipeline for consistent execution of instructions with minimum hazards.
5. Understand: Recognize and manipulate representations of numbers stored in digital computers.
II B.TECH - I Semester (MR22)
Assignment Questions-I
Subject: Computer Organization and Architecture
S.No | Question | Blooms Taxonomy level | CO
4 | Explain instruction codes and addressing modes | Evaluate | 2
5 | Briefly explain fixed point representation and floating point representation | Understand | 3
Assignment Questions-II
S.No | Question | Bloom’s Taxonomy Level | CO
1 | Explain Decimal Arithmetic unit and Decimal Arithmetic operations. | Analyze | 3
2 | Explain how asynchronous data transfer is performed. | Evaluate | 4
3 | List out different memories and explain briefly the memory organization | Evaluate | 4
4 | Differentiate between CISC and RISC | Understand | 5
5 | Explain briefly about Interprocessor communication and synchronization | Understand | 5
MODULE-I
S.No | Question | Blooms Taxonomy level | CO
Module-2
S.No | Question | Blooms Taxonomy level | CO
1 | Analyze the 3 different mapping processes used in cache memory organization. | Analyzing | 2
3 | Interpret the Memory Connection to CPU by using Memory Address Mapping of RAM Chip and ROM Chip | Evaluating | 2
b) Associative Memory
Module-3
S.No | Question | Blooms Taxonomy level | CO
1 | Write a brief note on interprocessor arbitration | Analyze | 3
2 | Difference between RISC and CISC | Evaluate | 4
3 | Write about Interprocessor communication and Synchronization | Evaluate | 4
4 | List out the important stages of Instruction Pipeline | Understand | 5
6 | What is parallel processing? Explain any parallel processing mechanism | Evaluate | 4
7 | Explain the interconnection structure for multiprocessor systems | Evaluate | 4
8 | What is multiprocessor system? Explain the advantages of multiprocessors over uniprocessors. | Understand | 5
Unit III
2. Design the flowchart for Booth's Multiplication algorithm with an example.
3. Write about Computer Arithmetic Addition with neat flow chart.
4. Illustrate Decimal Arithmetic Unit and Operations
Unit IV
3. Interpret the Memory Connection to CPU by using Memory Address Mapping of RAM Chip and
ROM Chip
4. Explain a) Auxiliary Memory
b) Associative Memory
5. Sketch the block diagram of DMA using DMA Controller
6. Explain the different types of modes of transfer in detail
7. Describe the asynchronous data transfer
Unit V
2. Difference between RISC and CISC
3. Write about Interprocessor communication and Synchronization
4. List out the important stages of Instruction Pipeline
5. Illustrate Flynn’s Classification of parallel processing.
6. What is parallel processing? Explain any parallel processing mechanism
7. Explain the interconnection structure for multiprocessor systems
8. What is multiprocessor system? Explain the advantages of multiprocessors over uniprocessors.
MALLA REDDY ENGINEERING COLLEGE (AUTONOMOUS)
II B.Tech I Semester (MR-24) 2024-2025 Admitted Batch
Mid Term Examinations-I
Subject Code & Name: D0506 & Computer Organization and Architecture
Branch: CSE
Model-1
3 How are the sequential circuits specified in terms of time sequence? D
a) By Inputs b) By Outputs c) By Internal states d) All of the above
5 Which memory elements are utilized in asynchronous & clocked sequential circuits respectively? B
a) Time-delay devices & registers b) Time-delay devices & flip-flops
The characteristic equation of D-flip-flop implies that
b) the next state is dependent on present state
c) the next state is independent of previous state
8 Which circuit is generated from D-flip-flop due to addition of an inverter by causing reduction in the number of inputs? D
a) Gated JK-latch b) Gated SR-latch c) Gated T-latch d) Gated D-latch
13 What is the maximum possible range of bit-count specifically in n-bit binary counter consisting of 'n' number of flip-flops? B
a) 0 to 2^n b) 0 to 2^n-1 c) 0 to 2^n+1 d) 0 to 2^n+1 / 2
14 Which property of unit distance counters has the potential to overcome the consequences of multi-bit change flashing that arises in almost all conventional binary and decimal counters? A
a) one bit change per unit change b) two bits change per unit change
What is the required relationship between number of flip-flops and the timing signals in Johnson Counter?
18 The bus-request control input of micro-processor indicates the temporary suspension of current operation by driving all buses into ____. A
a) high impedance state b) low impedance state c) both a & b d) none of the above
19 Which feature conducts the memory transfer by controlling the address and data buses on the basis of request originated by the device when buses get disabled by the microprocessor? B
a) Indirect Memory Access b) Direct Memory Access
20 By default counters are incremented by A
a) 1 b) 2 c) 3 d) 4
21 Simplest registers consist only of D
a) Counter b) EPROM c) Latch d) flip-flop
23 A decimal counter has B
a) 5 states b) 10 states c) 15 states d) 20 states
25 2 left shifts are referred to as multiplication with B
a) 2 b) 4 c) 8 d) 16
26 Ripple counters are also called B
a) SSI counters b) asynchronous counters c) synchronous counters d) VLSI counters
27 Transformation of information into registers is called A
a) Loading b) gated latch c) Latch d) Storing
28 Binary counter that counts incrementally and decrementally is called A
a) up-down counter b) LSI counters c) down counter d) up counter
29 Shift registers having four bits will enable shift control signal for C
a) 2 clock pulses b) 3 clock pulses c) 4 clock pulses d) 5 clock pulses
30 A group of binary cells is called B
a) Counter b) Register c) Latch d) Flip-flop
31 Synchronous counter is a type of C
a) SSI counters b) LSI counters c) MSI counters d) VLSI counters
32 BCD counter is also known as B
a) parallel counter b) decade counter c) synchronous counter d) VLSI counter
33 An 8-bit flip-flop will have D
a) 2 binary cells b) 4 binary cells c) 6 binary cells d) 8 binary cells
34 Parallel load transfer is done in A
a) 1 cycle b) 2 cycle c) 3 cycle d) 4 cycle
36 Ripple counter cannot be described by A
a) Boolean equation b) clock duration c) Graph d) flowchart
38 Parallel loading is done in A
a) 1 cycle b) 2 cycle c) 3 cycle d) 4 cycle
39 Control unit in serial computer generates a ____ B
a) reset signal b) word-time signal c) word signal d) clear signal
40 BCD counter counts from C
a) 0 to 5 b) 1 to 5 c) 0 to 9 d) 1 to 9
41 J=K=0 will make flip-flops C
a) Changed b) Reversed c) Unchanged d) Stopped
42 Special type of registers are C
a) Latch b) Flip-flop c) Counters d) Memory
43 Flip-flops in registers are C
a) Present b) level triggered c) edge triggered d) not present
44 Down counter decrements value by A
a) 1 b) 2 c) 3 d) 4
45 Ripple counter is a type of C
a) SSI counters b) LSI counters c) MSI counters d) VLSI counters
46 Propagation of signal through counters is in A
a) ripple fashion b) serial fashion c) parallel fashion d) both a and b
47 Register shifting both left and right is called B
a) unidirectional shift register b) bidirectional shift register c) left shift register d) right shift register
48 A decimal counter has C
a) 2 flip-flops b) 3 flip-flops c) 4 flip-flops d) 5 flip-flops
49 Control variable of registers is also called B
a) store control input b) load control input c) store control output d) load control output
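Several Model-1 questions (13, 25 and 40) rest on simple arithmetic about counters and shifts. The following is a minimal Python sketch of that arithmetic, not part of the question paper; the function names and the 8-bit default register width are assumptions chosen only for illustration.

```python
def counter_range(n_bits: int) -> tuple[int, int]:
    """Return the (lowest, highest) count of an n-bit binary counter."""
    # n flip-flops give 2**n states, so the count runs from 0 to 2**n - 1.
    return 0, (1 << n_bits) - 1


def shift_left(value: int, places: int, width: int = 8) -> int:
    """Shift left inside a register of `width` bits; bits shifted out are lost."""
    mask = (1 << width) - 1  # keeps the result inside the register (assumed width)
    return (value << places) & mask


def bcd_counter_states() -> list[int]:
    """States a decade (BCD) counter passes through before wrapping back to 0."""
    return list(range(10))


if __name__ == "__main__":
    print(counter_range(4))            # range of a 4-bit counter
    print(shift_left(0b00000101, 2))   # two left shifts multiply 5 by 4
    print(bcd_counter_states())        # ten states, 0 through 9
```

Each left shift doubles the value, so two shifts multiply by 4 as long as no 1 bit is shifted out of the register.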
Model-2
51 A fast electronic machine that accepts digital input information, processes it and produces the resulting output is B
a) Analog Computer b) Digital Computer c) Workstation d) Super Computer
52 List of Instructions is A
a) Computer Program b) Function c) Procedure d) Sub Routine
53 Internal Storage is called A
a) Computer Memory b) Stack c) Queue d) Data structure
54 Computer used in home, office and schools is C
a) Super Computer b) Mainframe Computer c) Personal Computer d) Client machine
55 Computer having High resolution graphics I/O capability D
a) Desktop Computer b) Digital Computer c) Network Computer d) Workstation
57 Computers used for large scale numerical calculations is A
a) Super computers b) Servers c) Mainframe d) Network PC
58 Systems handling large volumes of requests to access data is B
a) Super computers b) Servers c) Mainframe d) Network PC
60 ____ unit accepts information from human operators B
a) Output b) Input c) ALU d) Control Unit
62 Expand ASCII A
a) American Standard Code for Information Interchange
b) American Social code for Instruction Interchange
63 Two classes of storage B
a) Serial, parallel b) Primary, secondary c) Input, output d) ION, IOF
64 ASCII is a ____ bit code D
a) 1 b) 3 c) 5 d) 7
67 The time required to access one word is called the ____ D
a) Memory Read Time b) Memory Write Time
68 Basic arithmetic operations are performed in B
a) CU b) ALU c) Memory d) Input
69 ____ sends the processor results to outside world A
a) Output Unit b) Input Unit c) Memory Unit d) Control Unit
70 ____ determines when a given action has to take place C
a) Clock b) Interval c) Timing Signal d) Pulse
71 A group of lines that serve as connecting path for several devices is B
a) Cable b) Bus c) Wire d) Line
73 ____ is a collection of programs B
a) Hardware b) System Software c) Circuitry d) Directory
Expand SCSI
75 Important measure of computer is C
a) System Software b) System Hardware c) Performance d) Bus
76 Processor Circuits are controlled by a timing signal called a ____ A
a) Clock b) Pulse c) Flip Flop d) Registers
78 Overlapping the execution of successive instructions is called B
a) Vector Processing b) Pipelining c) Array Processing d) Central Processing
79 a) 3 combinational inputs b) 4 combinational inputs D
80 ____ translates high level language to machine language C
a) Assembler b) Router c) Compiler d) Interpreter
81 ____ holds the address which is to be accessed D
a) MDR b) PC c) IR d) MAR
82 Operations executed on data stored in registers are A
a) Micro operations b) Mini operations c) Large scale operations d) Small scale operations
83 ____ micro operations are performed on numeric data stored in registers B
a) Register transfer b) Arithmetic c) Logic d) Shift
84 ____ micro operations perform bit manipulation operations on non numeric data stored in registers C
a) Register transfer b) Arithmetic c) Logic d) Shift
85 a) Register transfer micro operations b) Arithmetic micro operations B
88 ____ operation sets to 1 the bits in register A where there are corresponding 1's in register B B
a) Selective Clear b) Selective Set c) Selective Complement d) Selective Reset
89 ____ operation complements bits in register A where there are corresponding 1's in register B C
a) Selective Clear b) Selective Set c) Selective Complement d) Selective Reset
90 ____ operation clears to zero the bits in A only where there are corresponding 1's in register B A
a) Selective Clear b) Selective Set c) Selective Complement d) Selective Reset
94 Expand PC A
a) Program Counter b) Process Counter c) Program Circuit d) Parity Counter
95 Expand IR B
a) Interrupt Register b) Instruction Register c) Isolated Rate d) Integrated Route
Expand MDR
c) Memory Dual Register d) Memory Data Register
99 A group of bits that tell the computer to perform a specific operation is known as ____. A
a) Instruction code b) Micro-operation c) Accumulator d) Register
100 The time interval between adjacent bits is called the B
a) Word-time b) Bit-time c) Turnaround time d) Slice time
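Questions 88 to 90 of Model-2 refer to the selective-set, selective-complement and selective-clear logic microoperations. As an illustrative sketch only, they can be modelled in Python with bitwise operators (OR, XOR and AND with the complement of B). The 4-bit register width and the function names are assumptions made for this example.

```python
WIDTH = 4                    # assumed register width for the illustration
MASK = (1 << WIDTH) - 1      # keeps results inside WIDTH bits


def selective_set(a: int, b: int) -> int:
    """Set to 1 the bits of A where B has 1's: A <- A OR B."""
    return (a | b) & MASK


def selective_complement(a: int, b: int) -> int:
    """Complement the bits of A where B has 1's: A <- A XOR B."""
    return (a ^ b) & MASK


def selective_clear(a: int, b: int) -> int:
    """Clear to 0 the bits of A where B has 1's: A <- A AND (NOT B)."""
    return (a & ~b) & MASK


if __name__ == "__main__":
    A, B = 0b1010, 0b1100
    for op in (selective_set, selective_complement, selective_clear):
        print(f"{op.__name__}: {op(A, B):0{WIDTH}b}")
```

With A = 1010 and B = 1100, the three operations give 1110, 0110 and 0010 respectively; bits of A where B holds 0 are left unchanged in every case.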
Model-3
The type of control signal are generated based on
110 What does the hardwired control generator consist of? D
a) Decoder/encoder b) Condition codes c) Control step counter d) All of the above
What does the end instruction do?
The disadvantage/s of the hardwired approach is
What is the control unit's function in the CPU
118 Which register is used to generate the different control signals? D
a) PC b) MAR c) MBR d) IR
119 Which register is used to hold the address when either reading or writing? B
a) PC b) MAR c) MBR d) IR
120 Control memory is B
a) RAM b) ROM c) Virtual memory d) Cache memory
c) Less error prone, slower d) Faster, harder to change
123 A micro-programmed control unit B
b) facilitates easy implementation of new instructions
Control program memory can be reduced by
125 Hardwired control is usually done in A
a) RISC architecture b) CISC architecture c) Both a and b d) None of above
MALLA REDDY ENGINEERING COLLEGE (AUTONOMOUS)
II B.Tech I Semester (MR24) 2024-2025 Admitted Batch
Mid Term Examinations-II
Subject Code & Name: D0506 & Computer Organization and Architecture
Branch: CSE
OBJECTIVE QUESTION BANK
1 The fastest data access is provided using ____. D
Caches
DRAM’s
SRAM’s
Registers
2 The effectiveness of the cache memory is based on the property of A
Locality of reference
Memory localization
Memory size
3 The correspondence between the main memory blocks and those in the cache is given by ____. C
Hash function
Locale function
Mapping function
Assign function
4 The algorithm to remove and place new contents into the cache is called A
Replacement algorithm
Updation
Renewal algorithm
5 The bit used to signify that the cache location is updated is ____. A
Dirty bit
Update bit
Reference bit
Flag bit
6 The last on the hierarchy scale of memory devices is ____. B
Main memory
Secondary memory
TLB
Flash drives
7 In memory interleaving, the lower order bits of the address is used to C
Get the data
Get the address of the data within the module
Get the address of the module
8 The number of successful accesses to memory stated as a fraction is called as ____. A
Hit rate
Miss rate
Success rate
Access rate
9 The number of failed attempts to access memory, stated in the form of fraction is called as ____. B
Hit rate
Miss rate
Failure rate
Delay rate
Delay
Miss
Hit
Delayed hit
11 The extra time needed to bring the data into memory in case of a miss is called as ____. C
Delay
Propagation time
Miss penalty
12 The key factor/s in commercial success of a computer is/are ____. D
Performance
Cost
Speed
Both a and b
13 The main objective of the computer system is B
To provide optimal power operation.
To provide best performance at low cost.
To provide speedy operation at low power consumption.
All of the above.
14 The main purpose of having memory hierarchy is to D
Reduce access time.
Provide large capacity.
Reduce propagation time.
Both a and b.
15 The program is divided into operable parts called as ____. B
Frames
Segments
Pages
Sheets
16 The techniques which move the program blocks to or from the physical memory is called as ____. B
Paging
Virtual memory organization
Overlays
Framing
17 The binary address issued to data or instructions are called as ____. D
Physical address
Location
Relocatable address
Logical address
18 ____ is used to implement virtual memory organization. C
Page table
Frame table
MMU
19 ____ translates logical address into physical address. A
MMU
Translator
Compiler
Linker
20 The main aim of virtual memory organization is D
To provide effective memory access.
To provide better memory transfer.
To improve the execution of the program.
All of the above.
21 The virtual memory basically stores the next segment of data to be executed on the ____. A
Secondary storage
Disks
RAM
ROM
22 For the synchronization of the read head, we make use of a ____. C
Framing bit
Synchronization bit
Clock
Dirty bit
23 Floating-point numbers are normally a multiple of the size of C
Bit
Nibble
Word
Byte
24 A 4 digit BCD number can be represented with the help of B
10 bits
16 bits
8 bits
12 bits
25 Any electronic holding place where data can be stored and retrieved later whenever required is A
Memory
Drive
Disk
Circuit
26 The logic operations are implemented using ____ circuits. C
Bridge
Logical
Combinational
Gate
27 The carry generation function: $c_{i+1} = y_i c_i + x_i c_i + x_i y_i$, is implemented in B
Half adders
Full adders
Ripple adders
Fast adders
28 The carry in the ripple adders, (which is true) C
Are generated at the beginning only.
Is generated at the end of each operation.
Must travel through the configuration.
29 In full adders the sum circuit is implemented using ____ gates. C
AND & OR
NAND
XOR
XNOR
30 The usual implementation of the carry circuit involves ____ gates. B
AND & OR
XOR
NAND
XNOR
31 A ____ gate is used to detect the occurrence of an overflow. B
NAND
XOR
XNOR
AND
32 In a normal adder circuit the delay obtained in generation of the output is A
2n+2
2n
n+2
33 The final addition sum of the numbers, 0110 & 0110 is A
1101
1111
1001
1010
34 The delay reduced to in the carry look ahead adder is ____. A
5
8
10
2n
Flip flops
Combinatorial
Fast adders
36 The multiplier is stored in ____. B
PC Register
Shift register
Cache
37 The ____ is used to co-ordinate the operation of the multiplier. C
Controller
Coordinator
Control sequencer
38 The multiplicand and the control signals are passed through to the n-bit adder via ____. A
MUX
DEMUX
Encoder
Decoder
39 The method used to reduce the maximum number of summands by half is ____. B
Fast multiplication
Bit-pair recoding
Quick multiplication
40 CSA stands for C
Computer Speed Addition
Computer Service Architecture
Carry Save Addition
None of the above
41 The numbers written to the power of 10 in the representation of decimal numbers are called as ____. C
Height factors
Size factors
Scale factors
Orthogonal
Normalized
Determinate
43 ____ constitute the representation of the floating number. D
Sign
Significant digits
Scale factor
All of the above
44 The sign followed by the string of digits is called as ____. C
Significant
Determinant
Mantissa
Exponent
45 In IEEE 32-bit representations, the mantissa of the fraction is said to occupy ____ bits. B
24
23
20
16
46 The 32 bit representation of the decimal number is called as ____. B
Double-precision
Single-precision
Extended format
47 In 32 bit representation the scale factor is a range of ____. A
-128 to 127
-256 to 255
0 to 255
48 When the processor executes multiple instructions at a time it is said to use D
Single issue
Multiplicity
Visualization
Multiple issue
49 The ____ plays a very vital role in case of superscalar processors. A
Compilers
Motherboard
Memory
Peripherals
50 In super-scalar processors, ____ mode of execution is used. C
In-order
Post order
Out of order
51 In memory-mapped I/O… A
The I/O devices and the memory share the same address space
The I/O devices have a separate address space
The memory and I/O devices have an associated address space
A part of the memory is specifically set aside for the I/O operation
52 The usual BUS structure used to connect the I/O devices is B
Star BUS structure
Single BUS structure
Multiple BUS structure
Node to Node BUS structure
53 The advantage of I/O mapped devices to memory mapped is C
The former offers faster transfer of data
The devices connected using I/O mapping have a bigger buffer space
The devices have to deal with fewer address lines
No advantage as such
54 To overcome the lag in the operating speeds of the I/O device and the processor we use B
Buffer spaces
Status flags
Interrupt signals
Exceptions
55 The method of accessing the I/O devices by repeatedly checking the status flags is A
Program-controlled I/O
I/O mapped
Memory-mapped I/O
None
56 The method which offers higher speeds of I/O transfers is D
Interrupts
Memory mapping
Program-controlled I/O
DMA
57 The process wherein the processor constantly checks the status flags is called as A
Polling
Inspection
Reviewing
Echoing
Delimiter
Sizeindicatormnemonic
Specialassemblers
59 The starting address is denoted using ______ directive C
EQU
ORIGIN
ORG
PLACE
60 The constant can be declared using ______ directive D
DATAWORD
PLACE
CONS
DC
61 To allocate a block of memory we use ______ directive B
RESERVE
DS
DATAWORD
PLACE
62 The Branch instruction in 68000 provides how many types of offsets? D
63 The DMA transfers are performed by a control circuit called as B
Device interface
DMA controller
Data controller
Overlooker
64 In DMA transfers, the required signals and addresses are given by the C
Processor
Device drivers
DMA controllers
The program itself
65 The DMA controller has ______ registers C
66 The controller is connected to the B
Processor BUS
System BUS
External BUS
67 The technique where the controller is given complete access to main memory is D
Cycle stealing
Memory stealing
Memory Con
Burst mode
68 To overcome the conflict over the possession of the BUS we use B
Optimizers
BUS arbitrators
Multiple BUS structure
64 bits
24 bits
32 bits
16 bits
70 The DMA transfer is initiated by C
Processor
The process being executed
I/O devices
OS
71 ______ interrupt method uses a register whose bits are set separately by the interrupt signal for each device A
Parallel priority interrupt
Daisy chaining
Serial priority interrupt
72 ______ register is used for the purpose of controlling the status of each interrupt request in parallel priority interrupt D
Mass
Mark
Make
Mask
73 Interrupts initiated by an instruction are called as D
Internal
External
Hardware
Software
74 The signals that are provided to maintain proper data flow and synchronization between the data transmitter and receiver are A
Handshaking signals
Control signals
Input signals
75 The example of output device is D
CRT display
7-segment display
Printer
All of the mentioned
76 ______ have been developed specifically for pipelined systems. C
Utility software
Speed up utilities
Optimizing compilers
77 The pipelining process is also called as ______. B
Superscalar operation
Assembly line operation
Von Neumann cycle
78 The fetch and execution cycles are interleaved with the help of ______. C
Modification in processor architecture
Special unit
Clock
Control unit
79 Each stage in pipelining should be completed within ______ cycle. A
Virtual
Secondary
Cache
81 If a unit completes its task before the allotted time period, then C
It'll perform some other task in the remaining time
Its time gets reallocated to a different task
It'll remain idle for the remaining time
82 To increase the speed of memory access in pipelining, we make use of C
Special memory locations
Special purpose registers
Cache
Buffers
83 Which of the following is independent of the address bus? A
Secondary memory
Main memory
Onboard memory
Cache memory
84 The iconic feature of the RISC machine among the following is C
Reduced number of addressing modes
Increased memory size
Having a branch delay slot
All of the mentioned
85 Both the CISC and RISC architectures have been developed to reduce the C
Cost
Time delay
Semantic gap
All of the mentioned
86 Which control refers to the track of the address of instructions C
Data control
Register control
Program control
None of these
87 In program control the instruction is set for the statement in: B
Parallel
Sequence
Both
None
88 SIMD stands for: D
System instruction multiple data
Scale instruction multiple data
Symmetric instruction multiple data
Single instruction multiple data
89 MIMD stands for: C
Multiple input multiple data
Memory input multiple data
Multiple instruction multiple data
Memory instruction multiple data
90 Which is a method of decomposing a sequential process into suboperations? A
Pipeline
CISC
RISC
Database
91 Which are the types of array processor? C
Attached array processor
SIMD array processor
Both
None
92 Which type of register holds a single vector containing at least two read ports and one write port D
Data system
Database
Memory
Vector register
93 Which is used to speed-up the processing: C
Pipeline
Vector processing
Both
None
94 Which processor is a peripheral device attached to a computer so that the performance of a computer can be improved for numerical computations? A
Attached array processor
SIMD array processor
Both
None
95 Which processor has a single instruction multiple data stream organization that manipulates the common instruction by means of multiple functional units? B
Attached array processor
SIMD array processor
Both
None
96 Processor without structural hazard is A
Faster
Slower
Have longer clock cycle
Have larger clock rate
97 Simplest scheme to handle branches is to D
Flush pipeline
Freezing pipeline
Depth of pipeline
Both a and b
98 Splitting cache into separate instructions and data caches or by using a set of buffers, usually called C
Cache buffer
Data buffer
Instruction buffer
None of above
99 With separate adder and a branch decision made during ID, there is only a A
1-clock-cycle stall on branches
2-clock-cycles stall on branches
4-clock-cycles stall on branches
3-clock-cycles stall on branches
Pipeline interlock
Deadlock
Stall interlock
Stall deadlock
101 If event occurs at same place every time program is executed with same data and memory allocation, then event is known as B
Stalled
Synchronous
Delayed
Asynchronous
102 Pipeline overhead arises from combination of pipeline register delay and D
Hit rate
Clock cycle
Cycle rate
Clock skew
103 Each of clock cycles from previous section of execution, becomes a A
Pipe stage
Previous stage
Stall
Processor cycle
Synchronous
Asynchronous
Pipelined
Blocked
105 When compiler attempts to schedule instructions to avoid hazard; this approach is called D
Compiler
Static scheduling
Dynamic scheduling
Both a and b
106 Pipelining increases CPU instruction B
Size
Throughput
Cycle rate
Time
107 Sum of contents of base register and sign-extended offset is used as a memory address, sum is known as C
ALU instructions
Throughput
Effective address
Load and store instructions
Instruction issue
Nullifying
Branch prediction
109 If an exception is raised and the succeeding instructions are executed completely, then the processor is said to have B
Exception handling
Imprecise exceptions
Error correction
110 The product of 1101 & 1011 is A
10001111
10101010
11110000
11001100
111 Code containing redundant loads, stores, and other operations that might be eliminated by an optimizer, is D
Optimized clock
Unoptimized clock
Optimized code
Unoptimized code
112 Delays arising from use of a load result 1 or 2 cycles after loads, refers as D
Data stall
Control stall
Branch stall
Load stall
113 Situations that prevent next instruction in instruction stream, from executing during its designated clock cycle are known C
Pipe stage
Previous stage
Hazards
Processor cycle
114 The product of -13 & 11 is B
1100110011
1101110001
1010101010
1111111000
115 The method used to reduce the maximum number of summands by half is B
Fast multiplication
Bit-pair recoding
Quick multiplication
116 The digital information is stored on the hard disk by A
Applying a suitable electric pulse
Applying a suitable magnetic field
Applying a suitable nuclear field
By using optic waves
10
20
Center
Middle
From the last used point
Boundaries
119 The associatively mapped virtual memory makes use of A
TLB
Page table
Frame table
120 If the instruction Add R1,R2,R3 is executed in a system which is pipelined, then the value of S is C
(Where S is a term of the Basic performance equation)
~2
~1
121 Two processors A and B have clock frequencies of 700 MHz and 900 MHz respectively. Suppose A can execute an instruction with an average of 3 steps and B can execute with an average of 5 steps. For the execution of the same instruction which processor is faster? A
Both take the same time
Insufficient information
122 An effective way to introduce parallelism in memory access is by A
Memory interleaving
TLB
Pages
Frames
123 The performance depends on B
The speed of execution only
The speed of fetch and execution
The speed of fetch only
The hardware of the system only
124 A common measure of performance is A
Price/performance ratio
Performance/price ratio
Operation/price ratio
None of the above
-1
+1
Both -1 and 0
MALLA REDDY ENGINEERING COLLEGE (AUTONOMOUS)
II B.Tech I Semester (MR 21) REGULAR END EXAMINATION MODEL QUESTION PAPER
Computer Organization and Architecture
(Common to CSE, IT, CSE (CS), CSE (AIML), CSE (DS), CSE (IOT))
Duration: 3 Hours Max. Marks: 70
3. (a) Describe Micro Program with an example. 6 L1 2
(b) Use different instruction formats and evaluate the expression Y=(A-B)/(C+D*E). 8 L3 2
OR
4. (a) Explain about address sequencing capabilities in control memory. 7 L1 2
(b) Demonstrate all Addressing Modes with a Numerical Example. 7 L3 2
5. (a) Design the flowchart for addition and subtraction operations with example. 7 L3 3
(b) Evaluate multiplication of two numbers using Multiplication algorithm with a numerical example. 7 L3 3
OR
6. (a) Use the flowchart for division algorithm and solve AQ=0111000000 divided by B=10001. 7 L3 3
(b) Design the flowchart for Booth's Multiplication algorithm with an example. 7 L2 3
7. Analyze the 3 different mapping processes used in cache memory organization. 14 L3 4
OR
8. (a) Sketch the block diagram of DMA using DMA Controller. 7 L3 4
(b) Analyze the 3 different mapping processes used in cache memory organization. 7 L3 4
9. (a) Difference between RISC and CISC. 7 L2 5
(b) List out the important stages of Instruction Pipeline. 7 L1
OR
10. (a) Illustrate about Flynn's Classification of parallel processing. 7 L2 5
(b) Write about Interprocessor communication and Synchronization. 7 L2 5
Digital: The word digital implies that the information in the computer is represented by variables that take a limited number of discrete values. These discrete values are processed internally by components that can maintain a limited number of discrete states. (For example, the decimal digits 0, 1, …, 9 provide 10 discrete values.)
The first electronic digital computers, developed in the 1940s, were used primarily for numerical computations (using discrete elements, i.e., digits). From this application the term digital computer emerged.
Digital components that are constrained to take discrete values are further constrained to take only two values: binary 0 and 1.
Information in digital computers is represented by a group of bits (a binary digit is called a bit).
By using various coding techniques, groups of bits can be made to represent not only binary numbers but also other discrete symbols, such as decimal digits and letters of the alphabet.
By use of binary arrangements and by using various coding techniques, the groups of bits are used to develop complete sets of instructions for performing various types of computations.
BLOCK DIAGRAM OF DIGITAL COMPUTER
A computer system is subdivided into two functional entities: hardware and software.
Memory Unit (MU): Contains storage for instructions and data. ROM (Read-Only Memory) is a part of the memory unit. MU is also called primary memory, internal memory, or principal memory.
Random Access Memory (RAM): used for real-time processing of the data. It allows your computer to switch between programs and have large files ready to view.
Eg: keyboard, mouse, terminals, magnetic disk drives, and other communication devices.
COMPUTER ORGANIZATION
It deals with the internal view of the computer and the roles that the internal components play during the execution of a program. It includes the organization of major parts of a computer such as the processor, memory and peripheral devices.
Processor organization: It deals with the main components of a processor, how these are interconnected and how these operate during execution of an instruction.
Memory organization: The memory organization of a memory unit deals with how its different components are structured and interconnected.
COMPUTER ARCHITECTURE
It deals with the external view of a computer, that is, it is concerned with the structure and behavior of the computer as viewed by a user such as an assembly language programmer or machine language programmer.
Specific instructions supported by the processor (called the processor instruction set),
Instruction formats,
Specific registers and their roles,
The techniques for accessing the data stored in the memory,
The way to perform input-output operations.
System programs interact directly with the computer hardware. They are written for a specific computer architecture.
Eg: operating systems, device drivers, compilers, etc.
Application programs invoke the services offered by the system programs. They are independent of the architecture and are converted to machine-dependent programs through a system program such as a compiler.
LOGIC GATES USED IN DIGITAL COMPUTER
Binary information is represented in digital computers by physical quantities called signals. Electrical signals such as voltages exist throughout the computer in either one of two recognizable states. The two states represent a binary variable that can be equal to 1 or 0.
Eg: A digital computer may utilize a signal of 3 volts to represent binary 1 and 0.5 volts to represent binary 0. The input terminals of digital circuits will accept binary signals of only 3 and 0.5 volts, and the outputs respond with signals corresponding to 1 or 0 respectively.
At the core level, a computer communicates in the form of 0 or 1, which is nothing but low and high voltage signals.
The manipulation of binary information is done by logic circuits called gates. Gates are blocks of hardware that produce signals of binary 1 or 0 when input logic requirements are satisfied.
LOGIC GATES:
Binary logic deals with binary variables and with operations that assume a logical meaning. It is used to describe, in algebraic or tabular form, the manipulation done by logic circuits called gates.
Each gate has a distinct graphic symbol, and its operation can be described by means of an algebraic expression. The input-output relationship of the binary variables for each gate can be represented by a truth table.
List of Logic Gates:
1. AND
2. OR
3. NOT
4. NAND
5. NOR
6. XOR
7. XNOR
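The truth tables of the gates listed above can be generated with a short sketch. This is only an illustration: the dictionary name GATES and the helper truth_table are hypothetical names chosen here, and each gate is modelled as a function on single bits rather than on voltages.

```python
# Two-input gates modelled as functions on bits (0 or 1).
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}


def NOT(a):
    """Single-input inverter."""
    return 1 - a


def truth_table(name):
    """Return the output column of a gate for inputs 00, 01, 10, 11."""
    gate = GATES[name]
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for name in GATES:
        print(name, truth_table(name))
```

Each printed list is the output column of the gate's truth table, read in the input order 00, 01, 10, 11.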
REGISTER TRANSFER AND MICROOPERATIONS
Register Transfer Language
Register Transfer
Bus And Memory Transfers
Types of Micro-operations
Arithmetic Micro-operations
Logic Micro-operations
Shift Micro-operations
Arithmetic Logic Shift Unit
BASIC DEFINITIONS:
A digital system is an interconnection of digital hardware modules.
The modules are registers, decoders, arithmetic elements, and control logic.
The various modules are interconnected with common data and control paths to form a digital computer system.
Digital modules are best defined by the registers they contain and the operations that are performed on the data stored in them.
The operations executed on data stored in registers are called microoperations.
A microoperation is an elementary operation performed on the information stored in one or more registers.
The result of the operation may replace the previous binary information of a register or may be transferred to another register.
Examples of microoperations are shift, count, clear, and load.
The internal hardware organization of a digital computer is best defined by specifying:
1. The set of registers it contains and their function.
REGISTERS:
Figure 4-1 shows the representation of registers in block diagram form.
REGISTER TRANSFER:
Information transfer from one register to another is designated in symbolic form by means of a replacement operator.
P is the control signal generated by a control section.
We can separate the control variables from the register transfer operation by specifying a Control Function.
A control function is a Boolean variable that is equal to 0 or 1.
The control function is included in the statement as
P: R2 ← R1
The control condition is terminated by a colon. It implies that the transfer operation is executed by the hardware only if P = 1.
Every statement written in register transfer notation implies a hardware construction for implementing the transfer.
Figure 4-2 shows the block diagram that depicts the transfer from R1 to R2.
The n outputs of register R1 are connected to the n inputs of register R2.
The letter n will be used to indicate any number of bits for the register. It will be replaced by an actual number when the length of the register is known.
Register R2 has a load input that is activated by the control variable P.
It is assumed that the control variable is synchronized with the same clock as the one applied to the register.
As shown in the timing diagram, P is activated in the control section by the rising edge of a clock pulse at time t.
The next positive transition of the clock at time t+1 finds the load input active and the data inputs of R2 are then loaded into the register in parallel.
P may go back to 0 at time t+1; otherwise, the transfer will occur with every clock pulse transition while P remains active.
Even though the control condition such as P becomes active just after time t, the actual transfer does not occur until the register is triggered by the next positive transition of the clock at time t+1.
The basic symbols of the register transfer notation are listed in the table below.
Symbol Description Examples
A comma is used to separate two or more operations that are executed at the same time. The statement
T: R2 ← R1, R1 ← R2 (exchange operation)
denotes an operation that exchanges the contents of two registers during one common clock pulse provided that T = 1.
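The exchange statement can be imitated in software with a minimal sketch. The function name clock_pulse and the dictionary of registers are hypothetical; the point illustrated is that both right-hand sides are sampled before either register is written, as happens when two registers load on the same clock edge.

```python
def clock_pulse(regs, T):
    """Apply  T: R2 <- R1, R1 <- R2  for one clock pulse.

    regs is a dict holding the current contents of R1 and R2.
    The transfer happens only when the control function T equals 1.
    """
    if T == 1:
        # Sample both sources first, then write both destinations.
        new_r1, new_r2 = regs["R2"], regs["R1"]
        regs["R1"], regs["R2"] = new_r1, new_r2
    return regs


if __name__ == "__main__":
    print(clock_pulse({"R1": 0b0101, "R2": 0b1001}, T=1))
    print(clock_pulse({"R1": 0b0101, "R2": 0b1001}, T=0))
```

With T = 1 the two values swap; with T = 0 the registers keep their contents.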
Bus and Memory Transfers:
The bus consists of four 4 x 1 multiplexers each having four data inputs, 0 through 3, and two selection inputs, S1 and S0.
For example, output 1 of register A is connected to input 0 of MUX 1 because this input is labelled A1.
The diagram shows that the bits in the same significant position in each register are connected to the data inputs of one multiplexer to form one line of the bus.
Thus MUX 0 multiplexes the four 0 bits of the registers, MUX 1 multiplexes the four 1 bits of the registers, and similarly for the other two bits.
The two selection lines S1 and S0 are connected to the selection inputs of all four multiplexers.
The selection lines choose the four bits of one register and transfer them into the four-line common bus.
When S1S0 = 00, the 0 data inputs of all four multiplexers are selected and applied to the outputs that form the bus.
This causes the bus lines to receive the content of register A since the outputs of this register are connected to the 0 data inputs of the multiplexers.
Similarly, register B is selected if S1S0 = 01, and so on.
Table 4-2 shows the register that is selected by the bus for each of the four possible binary values of the selection lines.
In general, a bus system will
multiplex "k" registers
of "n" bits each
to produce an "n-line bus"
no. of multiplexers required = n
size of each multiplexer = k x 1
When the bus is included in the statement, the register transfer is symbolized as follows:
BUS ← C, R1 ← BUS
The content of register C is placed on the bus, and the content of the bus is loaded into register R1 by activating its load control input. If the bus is known to exist in the system, it may be convenient just to show the direct transfer.
R1 ← C
THREE-STATE BUS BUFFERS:
It is distinguished from a normal buffer by having both a normal input and a control input.
The control input determines the output state. When the control input is equal to 1, the output is enabled and the gate behaves like any conventional buffer, with the output equal to the normal input.
When the control input is 0, the output is disabled and the gate goes to a high-impedance state, regardless of the value in the normal input.
The construction of a bus system with three-state buffers is shown in Fig. 4
The outputs of four buffers are connected together to form a single bus line.
The control inputs to the buffers determine which of the four normal inputs will communicate with the bus line.
No more than one buffer may be in the active state at any given time. The connected buffers must be controlled so that only one three-state buffer has access to the bus line while all other buffers are maintained in a high-impedance state.
One way to ensure that no more than one control input is active at any given time is to use a decoder, as shown in the diagram.
When the enable input of the decoder is 0, all of its four outputs are 0, and the bus line is in a high-impedance state because all four buffers are disabled.
When the enable input is active, one of the three-state buffers will be active, depending on the binary value in the select inputs of the decoder.
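The behaviour of one bus line built from four three-state buffers and a decoder can be sketched as follows. This is an illustrative model only: HIGH_Z, three_state_buffer, decoder_2to4 and bus_line are hypothetical names, and the high-impedance state is represented by Python's None.

```python
HIGH_Z = None  # stand-in for the high-impedance (disconnected) state


def three_state_buffer(data, control):
    """Pass the normal input when control is 1, otherwise float the output."""
    return data if control == 1 else HIGH_Z


def decoder_2to4(s1, s0, enable):
    """Return four outputs; at most one is 1, and all are 0 when disabled."""
    outputs = [0, 0, 0, 0]
    if enable == 1:
        outputs[2 * s1 + s0] = 1
    return outputs


def bus_line(inputs, s1, s0, enable):
    """Drive a single bus line from four buffer inputs.

    inputs holds the same bit position taken from four source registers.
    """
    controls = decoder_2to4(s1, s0, enable)
    driven = [three_state_buffer(d, c) for d, c in zip(inputs, controls)]
    active = [v for v in driven if v is not HIGH_Z]
    return active[0] if active else HIGH_Z


if __name__ == "__main__":
    print(bus_line([1, 0, 1, 0], s1=1, s0=0, enable=1))  # third buffer drives the line
    print(bus_line([1, 0, 1, 0], s1=0, s0=0, enable=0))  # line floats
```

Because the decoder activates at most one control input, the list of active buffers never holds more than one value, which is the condition the text requires.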
Memory Transfer:
The transfer of information from a memory word to the outside environment is called a read operation.
The transfer of new information to be stored into the memory is called a write operation.
A memory word will be symbolized by the letter M.
The particular memory word among the many available is selected by the memory address during the transfer.
It is necessary to specify the address of M when writing memory transfer operations.
This will be done by enclosing the address in square brackets following the letter M.
Consider a memory unit that receives the address from a register, called the address register, symbolized by AR.
The data are transferred to another register, called the data register, symbolized by DR.
Read: DR ← M[AR]
This causes a transfer of information into DR from the memory word M selected by the address in AR.
The write operation transfers the content of a data register to a memory word M selected by the address. Assume that the input data are in register R1 and the address is in AR.
The write operation can be stated as follows:
Write: M[AR] ← R1
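The read and write statements map directly onto list indexing, as the following sketch suggests. The class name Machine and the default memory size of 16 words are assumptions made only for this illustration; AR, DR, R1 and M follow the symbols used in the text.

```python
class Machine:
    """Tiny model with a memory M and registers AR, DR and R1."""

    def __init__(self, size=16):
        self.M = [0] * size  # memory words, all initially 0
        self.AR = 0          # address register
        self.DR = 0          # data register
        self.R1 = 0          # processor register holding data to write

    def read(self):
        # Read: DR <- M[AR]
        self.DR = self.M[self.AR]

    def write(self):
        # Write: M[AR] <- R1
        self.M[self.AR] = self.R1


if __name__ == "__main__":
    m = Machine()
    m.AR, m.R1 = 3, 0b1010
    m.write()
    m.read()
    print(m.DR)
```

After the write, reading the same address brings the stored word back into DR.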
Types of Micro-operations:
Register Transfer Micro-operations: Transfer binary information from one register to another.
Arithmetic Micro-operations: Perform arithmetic operations on numeric data stored in registers.
Logical Micro-operations: Perform bit manipulation operations on data stored in registers.
Shift Micro-operations: Perform shift operations on data stored in registers.
Register Transfer Micro-operation doesn't change the information content when the binary information moves from source register to destination register.
Other three types of micro-operations change the information content during the transfer.
Arithmetic Micro-operations:
The basic arithmetic micro-operations are
o Addition
o Subtraction
o Increment
o Decrement
o Shift
The arithmetic micro-operation defined by the statement below specifies the add micro-operation.
R3 ← R1 + R2
It states that the contents of R1 are added to the contents of R2 and the sum is transferred to R3.
To implement this statement, the hardware requires 3 registers and a digital component that performs addition.
Subtraction is most often implemented through complementation and addition. The subtract operation is specified by the following statement
R3 ← R1 + R2' + 1
Instead of the minus operator, we can write the statement as above, where R2' is the symbol for the 1's complement of R2.
Adding 1 to the 1's complement produces the 2's complement.
Adding the contents of R1 to the 2's complement of R2 is equivalent to R1 - R2.
Unit-1: REGISTER TRANSFER AND MICROOPERATIONS
CONTENTS:
Register Transfer Language
Register Transfer
Bus And Memory Transfers
Types of Micro-operations
Arithmetic Micro-operations
Logic Micro-operations
Shift Micro-operations
Arithmetic Logic Shift Unit
BASIC DEFINITIONS:
A digital system is an interconnection of digital hardware modules.
The modules are registers, decoders, arithmetic elements, and control logic.
The various modules are interconnected with common data and control paths to form a digital computer system.
Digital modules are best defined by the registers they contain and the operations that are performed on the data stored in them.
The operations executed on data stored in registers are called microoperations.
A microoperation is an elementary operation performed on the information stored in one or more registers.
The result of the operation may replace the previous binary information of a register or may be transferred to another register.
Examples of microoperations are shift, count, clear, and load.
The internal hardware organization of a digital computer is best defined by specifying:
1. The set of registers it contains and their function.
2. The sequence of microoperations performed on the binary information stored in the registers.
3. The control that initiates the sequence of microoperations.
REGISTER TRANSFER LANGUAGE:
The symbolic notation used to describe the micro-operation transfers among registers is called RTL (Register Transfer Language).
The use of symbols instead of a narrative explanation provides an organized and concise manner for listing the micro-operation sequences in registers and the control functions that initiate them.
A register transfer language is a system for expressing in symbolic form the microoperation sequences among the registers of a digital module.
It is a convenient tool for describing the internal organization of digital computers in a concise and precise manner.
Registers:
Computer registers are designated by uppercase letters (and optionally followed by digits or letters) to denote the function of the register.
For example, the register that holds an address for the memory unit is usually called a memory address register and is designated by the name MAR.
Other designations for registers are PC (for program counter), IR (for instruction register), and R1 (for processor register).
The individual flip-flops in an n-bit register are numbered in sequence from 0 through n-1, starting from 0 in the rightmost position and increasing the numbers toward the left.
Figure 4-1 shows the representation of registers in block diagram form.
Register Transfer:
Information transfer from one register to another is designated in symbolic form by means of a replacement operator.
The statement R2 ← R1 denotes a transfer of the content of register R1 into register R2.
It designates a replacement of the content of R2 by the content of R1.
By definition, the content of the source register R1 does not change after the transfer.
If we want the transfer to occur only under a predetermined control condition, then it can be shown by an if-then statement.
if (P = 1) then R2 ← R1
P is the control signal generated by a control section.
We can separate the control variables from the register transfer operation by specifying a Control Function.
A control function is a Boolean variable that is equal to 0 or 1.
The control function is included in the statement as
P: R2 ← R1
The control condition is terminated by a colon. It implies that the transfer operation is executed by the hardware only if P = 1.
Every statement written in register transfer notation implies a hardware construction for implementing the transfer.
Figure 4-2 shows the block diagram that depicts the transfer from R1 to R2.
The n outputs of register R1 are connected to the n inputs of register R2.
The letter n will be used to indicate any number of bits for the register. It will be replaced by an actual number when the length of the register is known.
Register R2 has a load input that is activated by the control variable P.
It is assumed that the control variable is synchronized with the same clock as the one applied to the register.
As shown in the timing diagram, P is activated in the control section by the rising edge of a clock pulse at time t.
The next positive transition of the clock at time t+1 finds the load input active and the data inputs of R2 are then loaded into the register in parallel.
P may go back to 0 at time t+1; otherwise, the transfer will occur with every clock pulse transition while P remains active.
Even though the control condition such as P becomes active just after time t, the actual transfer does not occur until the register is triggered by the next positive transition of the clock at time t+1.
The basic symbols of the register transfer notation are listed in the table below.
Bus and Memory Transfers:
A more efficient scheme for transferring information between registers in a multiple-register configuration is a Common Bus System.
A common bus consists of a set of common lines, one for each bit of a register.
Control signals determine which register is selected by the bus during each particular register transfer.
Different ways of constructing a Common Bus System
Using Multiplexers
Using Tri-state Buffers
Common bus system with multiplexers:
The multiplexers select the source register whose binary information is then placed on the bus.
The construction of a bus system for four registers is shown in the Figure below.
The bus consists of four 4 x 1 multiplexers each having four data inputs, 0 through 3, and two selection inputs, S1 and S0.
For example, output 1 of register A is connected to input 0 of MUX 1 because this input is labelled A1.
The diagram shows that the bits in the same significant position in each register are connected to the data inputs of one multiplexer to form one line of the bus.
Thus MUX 0 multiplexes the four 0 bits of the registers, MUX 1 multiplexes the four 1 bits of the registers, and similarly for the other two bits.
The two selection lines S1 and S0 are connected to the selection inputs of all four multiplexers.
The selection lines choose the four bits of one register and transfer them into the four-line common bus.
When S1S0 = 00, the 0 data inputs of all four multiplexers are selected and applied to the outputs that form the bus.
This causes the bus lines to receive the content of register A since the outputs of this register are connected to the 0 data inputs of the multiplexers.
Similarly, register B is selected if S1S0 = 01, and so on.
Table 4-2 shows the register that is selected by the bus for each of the four possible binary values of the selection lines.
In general, a bus system will
multiplex "k" registers
of "n" bits each
to produce an "n-line bus"
no. of multiplexers required = n
size of each multiplexer = k x 1
When the bus is included in the statement, the register transfer is symbolized as follows:
BUS ← C, R1 ← BUS
The content of register C is placed on the bus, and the content of the bus is loaded into register R1 by activating its load control input. If the bus is known to exist in the system, it may be convenient just to show the direct transfer.
R1 ← C
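The multiplexer-based bus described above, with n multiplexers of size k x 1, can be sketched in a few lines. The function names mux and common_bus are hypothetical, and registers are represented as bit lists with index 0 as the least significant bit, which is a choice made for this example only.

```python
def mux(inputs, select):
    """k-to-1 multiplexer: pass the data input chosen by the select value."""
    return inputs[select]


def common_bus(registers, select):
    """Form an n-line bus from k registers of n bits each.

    registers is a list of k bit lists (index 0 = least significant bit).
    MUX i collects bit i of every register, so n multiplexers are needed.
    """
    n = len(registers[0])
    return [mux([reg[i] for reg in registers], select) for i in range(n)]


if __name__ == "__main__":
    A = [0, 0, 0, 1]
    B = [1, 0, 1, 0]
    C = [1, 1, 1, 1]
    D = [0, 1, 1, 0]
    # S1S0 = 01 selects register B.
    print(common_bus([A, B, C, D], select=0b01))
```

With four 4-bit registers the comprehension builds four 4 x 1 multiplexers, matching the count given in the text.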
Three-State Bus Buffers:
A bus system can be constructed with three-state gates instead of multiplexers.
A three-state gate is a digital circuit that exhibits three states.
Two of the states are signals equivalent to logic 1 and 0 as in a conventional gate.
The third state is a high-impedance state.
The high-impedance state behaves like an open circuit, which means that the output is disconnected and does not have logic significance.
Because of this feature, a large number of three-state gate outputs can be connected with wires to form a common bus line without endangering loading effects.
The graphic symbol of a three-state buffer gate is shown in Fig. 4-4.
Memory Transfer:
The transfer of information from a memory word to the outside environment is called a read operation.
The transfer of new information to be stored into the memory is called a write operation.
A memory word will be symbolized by the letter M.
The particular memory word among the many available is selected by the memory address during the transfer.
It is necessary to specify the address of M when writing memory transfer operations.
This will be done by enclosing the address in square brackets following the letter M.
Consider a memory unit that receives the address from a register, called the address register, symbolized by AR.
The data are transferred to another register, called the data register, symbolized by DR.
The read operation can be stated as follows:
Read: DR ← M[AR]
Types of Micro-operations:
Register Transfer Micro-operations: Transfer binary information from one register to another.
Arithmetic Micro-operations: Perform arithmetic operations on numeric data stored in registers.
Logical Micro-operations: Perform bit manipulation operations on data stored in registers.
Shift Micro-operations: Perform shift operations on data stored in registers.
Arithmetic Micro-operations:
The basic arithmetic micro-operations are
o Addition
o Subtraction
o Increment
o Decrement
o Shift
The arithmetic micro-operation defined by the statement below specifies the add micro-operation.
R3 ← R1 + R2
It states that the contents of R1 are added to the contents of R2 and the sum is transferred to R3.
To implement this statement, the hardware requires 3 registers and a digital component that performs addition.
Subtraction is most often implemented through complementation and addition.
The subtract operation is specified by the following statement
R3 ← R1 + R2' + 1
Instead of the minus operator, we can write the statement as above, where R2' is the symbol for the 1's complement of R2.
Adding 1 to the 1's complement produces the 2's complement.
Adding the contents of R1 to the 2's complement of R2 is equivalent to R1 - R2.
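A quick numerical check of this equivalence, as a sketch: the function name subtract and the default register width of 4 bits are assumptions made for the illustration. The mask keeps every result inside an n-bit register, just as a hardware register would discard the carry out.

```python
def subtract(r1, r2, n=4):
    """Compute R1 - R2 as R1 + (1's complement of R2) + 1 in n bits."""
    mask = (1 << n) - 1
    ones_complement = ~r2 & mask  # every bit of R2 inverted
    return (r1 + ones_complement + 1) & mask


if __name__ == "__main__":
    # 0111 - 0011 = 0100
    print(format(subtract(0b0111, 0b0011), "04b"))
    # 0011 - 0101 gives the 2's complement representation of -2
    print(format(subtract(0b0011, 0b0101), "04b"))
```

When R1 is smaller than R2, the n-bit result is the 2's complement representation of the negative difference.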
Binary Adder:
A digital circuit that forms the arithmetic sum of 2 bits and the previous carry is called a FULL ADDER.
A digital circuit that generates the arithmetic sum of 2 binary numbers of any length is called a BINARY ADDER.
Figure 4-6 shows the interconnections of four full-adders (FA) to provide a 4-bit binary adder.
The augend bits of A and the addend bits of B are designated by subscript numbers from right to left, with subscript 0 denoting the low-order bit.
The carries are connected in a chain through the full-adders. The input carry to the binary adder is C0 and the output carry is C4. The S outputs of the full-adders generate the required sum bits.
An n-bit binary adder requires n full-adders.
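The chain of full-adders can be sketched as a loop in which each stage passes its carry to the next. The function names full_adder and binary_adder are hypothetical, and bit lists use index 0 for the low-order bit, following the subscript convention in the text.

```python
def full_adder(a, b, carry_in):
    """Add two bits and an incoming carry; return (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def binary_adder(a_bits, b_bits, c0=0):
    """n-bit binary adder built from n full-adders.

    a_bits and b_bits are lists with index 0 as the low-order bit.
    Returns (sum_bits, output_carry).
    """
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


if __name__ == "__main__":
    # 0101 + 0011 = 1000, written low-order bit first
    print(binary_adder([1, 0, 1, 0], [1, 1, 0, 0]))
```

For 4-bit operands the loop runs four times, one iteration per full-adder, and the final carry corresponds to C4.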
Binary Adder-Subtractor:
The addition and subtraction operations can be combined into one common circuit by including an exclusive-OR gate with each full-adder.
A 4-bit adder-subtractor circuit is shown in Fig. 4-7.
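One way such a circuit can work is sketched below, under an assumption not stated in this section: a mode input, here called m, feeds one input of every exclusive-OR gate and also serves as the input carry. With m = 0 the circuit adds, and with m = 1 it adds the 2's complement of B. The function name adder_subtractor is hypothetical.

```python
def adder_subtractor(a_bits, b_bits, m):
    """Add (m = 0) or subtract (m = 1) two numbers given as bit lists.

    Bit lists use index 0 as the low-order bit.
    Returns (result_bits, output_carry).
    """
    carry = m  # assumed: the mode bit is also the input carry
    result = []
    for a, b in zip(a_bits, b_bits):
        y = b ^ m  # XOR gate: passes b when m = 0, complements b when m = 1
        result.append(a ^ y ^ carry)
        carry = (a & y) | (carry & (a ^ y))
    return result, carry


if __name__ == "__main__":
    a = [1, 1, 1, 0]  # 0111 = 7
    b = [1, 1, 0, 0]  # 0011 = 3
    print(adder_subtractor(a, b, m=0))  # 7 + 3
    print(adder_subtractor(a, b, m=1))  # 7 - 3
```

The same four full-adder stages are reused for both operations; only the value of m changes.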
Binary Incrementer:
The increment microoperation adds one to a number in a register.
For example, if a 4-bit register has a binary value 0110, it will go to 0111 after it is incremented.
This can be accomplished by means of half-adders connected in cascade.
The diagram of a 4-bit combinational circuit incrementer is shown in Fig. 4-8.
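The cascade of half-adders can be sketched as follows. The names half_adder and increment are hypothetical; the sketch assumes that the first half-adder has one input tied to logic 1 and that each later stage receives the carry of the stage before it.

```python
def half_adder(x, y):
    """Return (sum, carry) of two bits."""
    return x ^ y, x & y


def increment(bits):
    """Add one to a number given as a bit list (index 0 = low-order bit).

    Returns (incremented_bits, output_carry).
    """
    carry = 1  # assumed: one input of the first half-adder is tied to 1
    out = []
    for b in bits:
        s, carry = half_adder(b, carry)
        out.append(s)
    return out, carry


if __name__ == "__main__":
    # 0110 -> 0111, written low-order bit first
    print(increment([0, 1, 1, 0]))
```

Incrementing 1111 returns all zeros with an output carry of 1, which shows the wrap-around of a 4-bit register.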
Arithmetic Circuit:
The basic component of an arithmetic circuit is the parallel adder.
By controlling the data inputs to the adder, it is possible to obtain different types of arithmetic operations.
The diagram of a 4-bit arithmetic circuit is shown in Fig. 4-9. It has four full-adder circuits that constitute the 4-bit adder and four multiplexers for choosing different operations.
There are two 4-bit inputs A and B and a 4-bit output D.
The four inputs from A go directly to the X inputs of the binary adder.
Each of the four inputs from B is connected to the data inputs of the multiplexers.
The multiplexers' data inputs also receive the complement of B.
The other two data inputs are connected to logic-0 and logic-1.
The four multiplexers are controlled by two selection inputs S1 and S0. The input carry Cin goes to the carry input of the FA in the least significant position. The other carries are connected from one stage to the next.
By controlling the value of Y with the two selection inputs S1 and S0 and making Cin equal to 0 or 1, it is possible to generate the eight arithmetic microoperations listed in Table 4-4.
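The behaviour of this circuit can be sketched as D = A + Y + Cin, with Y picked by the multiplexers. The text fixes only S1S0 = 01 as the complement of B; the sketch assumes the usual textbook ordering for the remaining selections (00 selects B, 10 selects logic-0, 11 selects logic-1). The function name is hypothetical.

```python
def arithmetic_circuit(a: int, b: int, s1: int, s0: int, cin: int, width: int = 4) -> int:
    """Return D = A + Y + Cin, where the multiplexers pick Y from S1 S0."""
    mask = (1 << width) - 1
    y_inputs = {
        (0, 0): b & mask,     # Y = B (assumed selection order)
        (0, 1): ~b & mask,    # Y = complement of B
        (1, 0): 0,            # Y = all 0's (logic-0, assumed selection order)
        (1, 1): mask,         # Y = all 1's (logic-1, assumed selection order)
    }
    y = y_inputs[(s1, s0)]
    return (a + y + cin) & mask   # the output carry is dropped in this sketch


a, b = 7, 2
for s1, s0 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    for cin in (0, 1):
        print(s1, s0, cin, arithmetic_circuit(a, b, s1, s0, cin))
```

With A = 7 and B = 2 the eight results are A + B, A + B + 1, A - B - 1, A - B, A, A + 1, A - 1, and A, which correspond to add, add with carry, subtract with borrow, subtract, transfer, increment, decrement, and transfer.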
Addition:
Subtraction:
When S1S0 = 01, the complement of B (written B') is applied to the Y inputs of the adder.
If Cin = 1, then D = A + B' + 1. This produces A plus the 2's complement of B, which is equivalent to the subtraction A - B.
When Cin = 0, then D = A + B'. This is equivalent to a subtract with borrow, that is, A - B - 1.
Increment:
Logic Micro-operations:
Logic microoperations specify binary operations for strings of bits stored in registers.
These operations consider each bit of the register separately and treat them as binary variables.
For example, the exclusive-OR microoperation with the contents of two registers R1 and R2 is symbolized by the statement
List of Logic Microoperations:
There are 16 different logic operations that can be performed with two binary variables.
They can be determined from all possible truth tables obtained with two binary variables, as shown in Table 4-5.
Selective complement
The selective-complement operation complements bits in A where there are corresponding 1's in B. It does not affect bit positions that have 0's in B. For example:
Insert
The insert operation inserts a new value into a group of bits. This is done by first masking the bits and then ORing them with the required value.
For example, suppose that an A register contains eight bits, 0110 1010. To replace the four leftmost bits by the value 1001, we first mask the four unwanted bits:
Clear
The clear operation compares the words in A and B and produces an all 0's result if the two numbers are equal. This operation is achieved by an exclusive-OR microoperation, as shown by the following example
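The three operations can be tried out with ordinary bitwise operators. This is a sketch only: the function names are hypothetical, and the insert example reuses the register value 0110 1010 and the new value 1001 from the text.

```python
def selective_complement(a: int, b: int) -> int:
    """Complement the bits of A where B has 1's (an exclusive-OR)."""
    return a ^ b


def insert_field(a: int, mask: int, value: int) -> int:
    """Mask the bits being replaced (AND), then OR in the new value."""
    return (a & mask) | value


def clear_if_equal(a: int, b: int) -> int:
    """Exclusive-OR gives all 0's when A and B are equal."""
    return a ^ b


a = 0b01101010
# Replace the four leftmost bits of A by 1001, keeping the four rightmost bits.
result = insert_field(a, mask=0b00001111, value=0b10010000)
print(format(result, "08b"))                                 # 10011010
print(format(selective_complement(0b1010, 0b1100), "04b"))  # 0110
print(clear_if_equal(0b1010, 0b1010))                        # 0
```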
Shift Microoperations:
Shift microoperations are used for serial transfer of data.
The contents of a register can be shifted to the left or the right.
During a shift-left operation the serial input transfers a bit into the rightmost position.
During a shift-right operation the serial input transfers a bit into the leftmost position.
There are three types of shifts: logical, circular, and arithmetic.
The symbolic notation for the shift microoperations is shown in Table 4-7.
Logical Shift:
o A logical shift is one that transfers 0 through the serial input.
o The symbols shl and shr are used for the logical shift-left and shift-right microoperations.
o The microoperations that specify a 1-bit shift to the left of the content of register R and a 1-bit shift to the right of the content of register R are shown in Table 4-7.
o The bit transferred to the end position through the serial input is assumed to be 0 during a logical shift.
Circular Shift:
o The circular shift (also known as a rotate operation) circulates the bits of the register around the two ends without loss of information.
o This is accomplished by connecting the serial output of the shift register to its serial input.
o We will use the symbols cil and cir for the circular shift left and right, respectively.
Arithmetic Shift:
o An arithmetic shift is a microoperation that shifts a signed binary number to the left or right.
o An arithmetic shift-left multiplies a signed binary number by 2.
o An arithmetic shift-right divides the number by 2.
o Arithmetic shifts must leave the sign bit unchanged because the sign of the number remains the same when it is multiplied or divided by 2.
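The shift types above can be compared on a small register. The sketch below assumes a 4-bit register (the constant WIDTH is chosen only for the demonstration) and reuses the symbols shl, shr, cil, and cir as function names. For the arithmetic case only the shift-right is sketched, named ashr here as an assumption; the shift-left is left out because it also needs an overflow check.

```python
WIDTH = 4                      # assumed register width for the demo
MASK = (1 << WIDTH) - 1


def shl(r: int) -> int:
    """Logical shift-left: 0 enters the rightmost position."""
    return (r << 1) & MASK


def shr(r: int) -> int:
    """Logical shift-right: 0 enters the leftmost position."""
    return r >> 1


def cil(r: int) -> int:
    """Circular shift-left: the leftmost bit re-enters on the right."""
    return ((r << 1) & MASK) | (r >> (WIDTH - 1))


def cir(r: int) -> int:
    """Circular shift-right: the rightmost bit re-enters on the left."""
    return (r >> 1) | ((r & 1) << (WIDTH - 1))


def ashr(r: int) -> int:
    """Arithmetic shift-right: the sign bit keeps its value."""
    sign = r & (1 << (WIDTH - 1))
    return (r >> 1) | sign


r = 0b1001
for op in (shl, shr, cil, cir, ashr):
    print(op.__name__, format(op(r), "04b"))
```

For R = 1001 this prints 0010, 0100, 0011, 1100, and 1100. Read as a signed 2's complement number, 1001 is -7 and the arithmetic shift-right result 1100 is -4, so the sign is preserved.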
Hardware Implementation:
A combinational circuit shifter can be constructed with multiplexers as shown in Fig. 4-12.
The 4-bit shifter has four data inputs, A0 through A3, and four data outputs, H0 through H3.
There are two serial inputs, one for shift left (IL) and the other for shift right (IR).
When the selection input S = 0, the input data are shifted right (down in the diagram).
When S = 1, the input data are shifted left (up in the diagram).
The function table in Fig. 4-12 shows which input goes to each output after the shift.
A shifter with n data inputs and outputs requires n multiplexers.
The two serial inputs can be controlled by another multiplexer to provide the three possible types of shifts.
Arithmetic Logic Shift Unit:
Instead of having individual registers performing the microoperations directly, computer systems employ a number of storage registers connected to a common operational unit called an arithmetic logic unit, abbreviated ALU.
The ALU is a combinational circuit, so the entire register transfer operation from the source registers through the ALU and into the destination register can be performed during one clock pulse period.
The shift microoperations are often performed in a separate unit, but sometimes the shift unit is made part of the overall ALU.
The arithmetic, logic, and shift circuits introduced in previous sections can be combined into one ALU with common selection variables. One stage of an arithmetic logic shift unit is shown in Fig. 4-13.
A particular microoperation is selected with inputs S1 and S0. A 4x1 multiplexer at the output chooses between an arithmetic output in Di and a logic output in Ei.
The data in the multiplexer are selected with inputs S3 and S2. The other two data inputs to the multiplexer receive inputs Ai-1 for the shift-right operation and Ai+1 for the shift-left operation.
The circuit whose one stage is specified in Fig. 4-13 provides eight arithmetic operations, four logic operations, and two shift operations.
Each operation is selected with the five variables S3, S2, S1, S0, and Cin.
The input carry Cin is used for selecting an arithmetic operation only.
Table 4-8 lists the 14 operations of the ALU. The first eight are arithmetic operations and are selected with S3S2 = 00.
The next four are logic operations and are selected with S3S2 = 01.
The input carry has no effect during the logic operations and is marked with don't-care x's.
The last two operations are shift operations and are selected with S3S2 = 10 and 11.
The other three selection inputs have no effect on the shift.
BASIC COMPUTER ORGANIZATION AND DESIGN
CONTENTS:
Instruction Codes
Computer Registers
Computer Instructions
Timing and Control
Instruction Cycle
Register-Reference Instructions
Memory-Reference Instructions
Input-Output and Interrupt
1. Instruction Codes:
The organization of the computer is defined by its internal registers, the timing and control structure, and the set of instructions that it uses.
The internal organization of a computer is defined by the sequence of micro-operations it performs on data stored in its registers.
The computer can be instructed about the specific sequence of operations it must perform.
The user controls this process by means of a program.
Program: a set of instructions that specify the operations, operands, and the sequence by which processing has to occur.
Instruction: a binary code that specifies a sequence of micro-operations for the computer.
The computer reads each instruction from memory and places it in a control register. The control then interprets the binary code of the instruction and proceeds to execute it by issuing a sequence of micro-operations. This process is the instruction cycle.
Instruction Code: a group of bits that instruct the computer to perform a specific operation.
An instruction code is usually divided into two parts: opcode and address (operand).
Operation Code (opcode):
A group of bits that define the operation.
E.g.: add, subtract, multiply, shift, complement.
The number of bits required for the opcode depends on the number of operations available in the computer.
An n-bit opcode can specify up to 2^n operations.
Address (operand):
Specifies the location of operands (registers or memory words).
Memory words are specified by their address.
Registers are specified by their k-bit binary code.
A k-bit code can specify up to 2^k registers.
Stored Program Organization:
The ability to store and execute instructions is the most important property of a general-purpose computer. This concept is called stored program organization.
The simplest way to organize a computer is to have one processor register and an instruction code format with two parts. The first part specifies the operation to be performed and the second specifies an address.
The figure below shows the stored program organization.
Instructions are stored in one section of memory and data in another.
For a memory unit with 4096 words we need 12 bits to specify an address, since 2^12 = 4096.
If we store each instruction code in one 16-bit memory word, we have available four bits for the operation code (abbreviated opcode) to specify one out of 16 possible operations, and 12 bits to specify the address of an operand.
Accumulator (AC):
Computers that have a single processor register usually assign to it the name accumulator and label it AC.
The operation is performed with the memory operand and the content of AC.
Addressing of Operand:
It is sometimes convenient to use the address bits of an instruction code not as an address but as the actual operand.
When the second part of an instruction code specifies an operand, the instruction is said to have an immediate operand.
When the second part specifies the address of an operand, the instruction is said to have a direct address.
When the second part of the instruction designates the address of a memory word in which the address of the operand is found, the instruction is said to have an indirect address.
One bit of the instruction code can be used to distinguish between a direct and an indirect address.
The instruction code format is shown in Fig. 5-2(a). It consists of a 3-bit operation code, a 12-bit address, and an indirect address mode bit designated by I. The mode bit is 0 for a direct address and 1 for an indirect address.
A direct address instruction is shown in Fig. 5-2(b).
It is placed in address 22 in memory. The I bit is 0, so the instruction is recognized as a direct address instruction. The opcode specifies an ADD instruction, and the address part is the binary equivalent of 457.
The control finds the operand in memory at address 457 and adds it to the content of AC.
The instruction in address 35 shown in Fig. 5-2(c) has a mode bit I = 1.
Therefore, it is recognized as an indirect address instruction.
The address part is the binary equivalent of 300. The control goes to address 300 to find the address of the operand. The address of the operand in this case is 1350.
The operand found in address 1350 is then added to the content of AC.
The effective address is defined to be the address of the operand in a computation-type instruction or the target address in a branch-type instruction.
Thus the effective address in the instruction of Fig. 5-2(b) is 457 and in the instruction of Fig. 5-2(c) is 1350.
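The two examples can be replayed in a short sketch that splits a 16-bit word into I, opcode, and address and then resolves the effective address. The function names, the dictionary used as memory, and the opcode value chosen for ADD are assumptions for illustration; the addresses 22, 35, 457, 300, and 1350 come from the text.

```python
def decode(word: int) -> tuple[int, int, int]:
    """Split a 16-bit instruction into (I, opcode, address)."""
    i_bit = (word >> 15) & 0x1      # bit 15: addressing mode
    opcode = (word >> 12) & 0x7     # bits 12-14: operation code
    address = word & 0xFFF          # bits 0-11: address part
    return i_bit, opcode, address


def effective_address(word: int, memory: dict[int, int]) -> int:
    """Return the address of the operand for a direct or indirect instruction."""
    i_bit, _, address = decode(word)
    if i_bit == 0:
        return address                   # direct: address part is the operand address
    return memory[address] & 0xFFF       # indirect: address part points at the operand address


ADD = 0b001  # assumed opcode value, for illustration only
memory = {
    22: (0 << 15) | (ADD << 12) | 457,   # direct ADD 457
    35: (1 << 15) | (ADD << 12) | 300,   # indirect ADD 300
    300: 1350,                           # holds the address of the operand
}
print(effective_address(memory[22], memory))  # 457
print(effective_address(memory[35], memory))  # 1350
```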
2. Computer Registers:
What is the need for computer registers?
Registers are needed in the computer for the following reasons:
Instruction sequencing needs a counter to calculate the address of the next instruction after execution of the current instruction is completed (PC).
It is necessary to provide a register in the control unit for storing the instruction code after it is read from memory (IR).
The computer needs processor registers for manipulating data (AC and TR) and a register for holding a memory address (AR).
The above requirements dictate the register configuration shown in Fig. 5-3.
The registers are also listed in Table 5-1 together with a brief description of their function and the number of bits that they contain.
The data register (DR) holds the operand read from memory.
The accumulator (AC) register is a general-purpose processing register.
The instruction read from memory is placed in the instruction register (IR).
The temporary register (TR) is used for holding temporary data during the processing.
The memory address register (AR) has 12 bits since this is the width of a memory address.
The program counter (PC) also has 12 bits and it holds the address of the next instruction to be read from memory after the current instruction is executed.
Two registers are used for input and output.
The input register (INPR) receives an 8-bit character from an input device.
The output register (OUTR) holds an 8-bit character for an output device.
Common Bus System:
The basic computer has eight registers, a memory unit, and a control unit.
Paths must be provided to transfer information from one register to another and between memory and registers.
A more efficient scheme for transferring information in a system with many registers is to use a common bus.
The connection of the registers and memory of the basic computer to a common bus system is shown in Fig. 5-4.
The outputs of seven registers and memory are connected to the common bus.
The specific output that is selected for the bus lines at any given time is determined from the binary value of the selection variables S2, S1, and S0.
The number along each output shows the decimal equivalent of the required binary selection.
For example, the number along the output of DR is 3. The 16-bit outputs of DR are placed on the bus lines when S2S1S0 = 011.
3. Computer Instructions:
The basic computer has three instruction code formats, as shown in Fig. 5-5. Each format has 16 bits.
The operation code (opcode) part of the instruction contains three bits, and the meaning of the remaining 13 bits depends on the operation code encountered.
A memory-reference instruction uses 12 bits to specify an address and one bit to specify the addressing mode I.
I is equal to 0 for a direct address and to 1 for an indirect address.
The register-reference instructions are recognized by the operation code 111 with a 0 in the leftmost bit (bit 15) of the instruction.
A register-reference instruction specifies an operation on the AC register, so an operand from memory is not needed. Therefore, the other 12 bits are used to specify the operation to be executed.
An input-output instruction does not need a reference to memory and is recognized by the operation code 111 with a 1 in the leftmost bit of the instruction.
The remaining 12 bits are used to specify the type of input-output operation.
The instructions for the computer are listed in Table 5-2.
The symbol designation is a three-letter word and represents an abbreviation intended for programmers and users.
The hexadecimal code is equal to the equivalent hexadecimal number of the binary code used for the instruction.
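The rule for telling the three formats apart can be written as a short sketch. The function name and the sample hexadecimal words are chosen only for illustration; the test on bits 12-14 and bit 15 follows the description above.

```python
def instruction_type(word: int) -> str:
    """Classify a 16-bit instruction word by opcode (bits 12-14) and bit 15."""
    opcode = (word >> 12) & 0x7
    bit15 = (word >> 15) & 0x1
    if opcode != 0b111:
        # Any opcode from 000 to 110 is a memory-reference instruction; bit 15 is I.
        return "memory-reference (indirect)" if bit15 else "memory-reference (direct)"
    # Opcode 111: bit 15 separates register-reference from input-output.
    return "input-output" if bit15 else "register-reference"


for word in (0x2107, 0xA107, 0x7800, 0xF400):
    print(format(word, "04X"), instruction_type(word))
```

The four sample words are classified as memory-reference (direct), memory-reference (indirect), register-reference, and input-output, in that order.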
Instruction Set Completeness:
A computer should have a set of instructions so that the user can construct machine language programs to evaluate any function.
The set of instructions is said to be complete if the computer includes a sufficient number of instructions in each of the following categories:
o Arithmetic, logical, and shift instructions
o Data instructions (for moving information to and from memory and processor registers)
o Program control or branch instructions
o Input and output instructions
There is one arithmetic instruction, ADD, and two related instructions, complement AC (CMA) and increment AC (INC). With these three instructions we can add and subtract binary numbers when negative numbers are in signed-2's complement representation.
The circulate instructions, CIR and CIL, can be used for arithmetic shifts as well as any other type of shifts desired.
There are three logic operations: AND, complement AC (CMA), and clear AC (CLA). The AND and complement provide a NAND operation.
Moving information from memory to AC is accomplished with the load AC (LDA) instruction. Storing information from AC into memory is done with the store AC (STA) instruction.
The branch instructions BUN, BSA, and ISZ, together with the four skip instructions, provide capabilities for program control and checking of status conditions.
The input (INP) and output (OUT) instructions cause information to be transferred between the computer and external devices.
4. Timing and Control:
The timing for all registers in the basic computer is controlled by a master clock generator.
The clock pulses are applied to all flip-flops and registers in the system, including the flip-flops and registers in the control unit.
The clock pulses do not change the state of a register unless the register is enabled by a control signal.
The control signals are generated in the control unit and provide control inputs for the multiplexers in the common bus, control inputs in processor registers, and microoperations for the accumulator.
There are two major types of control organization:
o Hardwired control
o Microprogrammed control
The differences between hardwired and microprogrammed control are
Hardwired control Microprogrammed control
The outputs of the counter are decoded into 16 timing signals T0 through T15.
The sequence counter SC can be incremented or cleared synchronously.
The counter is incremented to provide the sequence of timing signals out of the 4x16 decoder.
As an example, consider the case where SC is incremented to provide timing signals T0, T1, T2, T3, and T4 in sequence. At time T4, SC is cleared to 0 if decoder output D3 is active.
This is expressed symbolically by the statement
D3T4: SC ← 0
The timing diagram of Fig. 5-7 shows the time relationship of the control signals.
The sequence counter SC responds to the positive transition of the clock.
Initially, the CLR input of SC is active. The first positive transition of the clock clears SC to 0, which in turn activates the timing signal T0 out of the decoder. T0 is active during one clock cycle.
SC is incremented with every positive clock transition, unless its CLR input is active.
This produces the sequence of timing signals T0, T1, T2, T3, T4, and so on, as shown in the diagram.
The last three waveforms in Fig. 5-7 show how SC is cleared when D3T4 = 1.
Output D3 from the operation decoder becomes active at the end of timing signal T2.
When timing signal T4 becomes active, the output of the AND gate that implements the control function D3T4 becomes active.
This signal is applied to the CLR input of SC. On the next positive clock transition (the one marked T4 in the diagram) the counter is cleared to 0.
This causes the timing signal T0 to become active instead of T5, which would have been active if SC were incremented instead of cleared.
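The behaviour of the sequence counter can be imitated with a small loop. This is a simplified sketch: the function name is hypothetical, D3 is treated as a constant input rather than a decoded signal, and one loop iteration stands for one clock cycle.

```python
def timing_sequence(d3: bool, clock_pulses: int) -> list[str]:
    """List the timing signal that is active during each clock cycle."""
    sc = 0                               # SC starts cleared, so T0 is active first
    signals = []
    for _ in range(clock_pulses):
        signals.append(f"T{sc}")
        if d3 and sc == 4:               # control function D3T4
            sc = 0                       # D3T4: SC <- 0
        else:
            sc = (sc + 1) % 16           # 4-bit counter: T15 is followed by T0
    return signals


print(timing_sequence(d3=True, clock_pulses=7))
# ['T0', 'T1', 'T2', 'T3', 'T4', 'T0', 'T1']
```

With D3 inactive the counter is never cleared early and simply runs from T0 through T15 before wrapping around.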
5. Instruction Cycle:
A program residing in the memory unit of the computer consists of a sequence of instructions.
The program is executed in the computer by going through a cycle for each instruction.
Each instruction cycle in turn is subdivided into a sequence of subcycles or phases.
In the basic computer each instruction cycle consists of the following phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction has an indirect address.
4. Execute the instruction.
Upon the completion of step 4, the control goes back to step 1 to fetch, decode, and execute the next instruction.
Fetch and Decode:
Initially, the program counter PC is loaded with the address of the first instruction in the program.
The sequence counter SC is cleared to 0, providing a decoded timing signal T0.
The microoperations for the fetch and decode phases can be specified by the following register transfer statements.
Determine the Type of Instruction:
The timing signal that is active after the decoding is T3.
During time T3, the control unit determines the type of instruction that was read from memory.
The flowchart of Fig. 5-9 shows the initial configurations for the instruction cycle and also how the control determines the instruction type after the decoding.
Decoder output D7 is equal to 1 if the operation code is equal to binary 111.
If D7 = 1, the instruction must be a register-reference or input-output type.
If D7 = 0, the operation code must be one of the other seven values 000 through 110, specifying a memory-reference instruction.
Control then inspects the value of the first bit of the instruction, which is now available in flip-flop I.
If D7 = 0 and I = 1, the instruction is a memory-reference instruction with an indirect address, so it is then necessary to read the effective address from memory.
If D7 = 0 and I = 0, the instruction is a memory-reference instruction with a direct address.
If D7 = 1 and I = 0, the instruction is a register-reference instruction.
If D7 = 1 and I = 1, the instruction is an input-output instruction.
The three instruction types are subdivided into four separate paths.
The selected operation is activated with the clock transition associated with timing signal T3.
This can be symbolized as follows:
Register-Reference Instructions:
Register-reference instructions are recognized by the control when D7 = 1 and I = 0.
These instructions use bits 0 through 11 of the instruction code to specify one of 12 instructions.
These 12 bits are available in IR(0-11).
The control functions and microoperations for the register-reference instructions are listed in Table 5-3.
These instructions are executed with the clock transition associated with timing variable T3.
Each control function needs the Boolean relation D7I'T3, which we designate for convenience by the symbol r.
By assigning the symbol Bi to bit i of IR, all control functions can be simply denoted by rBi.
6. Memory-Reference Instructions:
Table 5-4 lists the seven memory-reference instructions.
The decoded output Di for i = 0, 1, 2, 3, 4, 5, and 6 from the operation decoder that belongs to each instruction is included in the table.
The effective address of the instruction is in the address register AR and was placed there during timing signal T2 when I = 0, or during timing signal T3 when I = 1.
The execution of the memory-reference instructions starts with timing signal T4.
The symbolic description of each instruction is specified in the table in terms of register transfer notation.
AND to AC:
This is an instruction that performs the AND logic operation on pairs of bits in AC and the memory word specified by the effective address.
The result of the operation is transferred to AC.
The microoperations that execute this instruction are:
ADD to AC:
This instruction adds the content of the memory word specified by the effective address to the value of AC.
The sum is transferred into AC and the output carry Cout is transferred to the E (extended accumulator) flip-flop.
The microoperations needed to execute this instruction are
LDA: Load to AC
This instruction transfers the memory word specified by the effective address to AC.
The microoperations needed to execute this instruction are
STA: Store AC
BUN: Branch Unconditionally
This instruction transfers the program to the instruction specified by the effective address.
The BUN instruction allows the programmer to specify an instruction out of sequence, and we say that the program branches (or jumps) unconditionally.
The instruction is executed with one microoperation:
BSA: Branch and Save Return Address
The BSA instruction is assumed to be in memory at address 20.
The I bit is 0 and the address part of the instruction has the binary equivalent of 135.
After the fetch and decode phases, PC contains 21, which is the address of the next instruction in the program (referred to as the return address). AR holds the effective address 135.
This is shown in part (a) of the figure.
The BSA instruction performs the following numerical operation:
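A sketch of this numerical example in code, assuming the usual definition of branch and save return address (the return address in PC is stored at M[AR] and execution continues at AR + 1). The function name and the dictionary standing in for memory are hypothetical; the values 21 and 135 come from the text.

```python
def bsa(pc: int, ar: int, memory: dict[int, int]) -> int:
    """Store the return address at M[AR] and return the new PC (AR + 1)."""
    memory[ar] = pc          # M[AR] <- PC  (save the return address)
    return ar + 1            # PC <- AR + 1 (first instruction of the subroutine)


memory = {}
pc = bsa(pc=21, ar=135, memory=memory)
print(memory[135], pc)  # 21 136
```

Under this assumption the subroutine can later return by branching indirectly through address 135, where the return address 21 was saved.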
ISZ: Increment and Skip if Zero
Control Flowchart:
A flowchart showing all microoperations for the execution of the seven memory-reference instructions is shown in Fig. 5-11.
7. Input-Output and Interrupt:
Instructions and data stored in memory must come from some input device.
Computational results must be transmitted to the user through some output device.
To demonstrate the most basic requirements for input and output communication, we will use as an illustration a terminal unit with a keyboard and printer.
Input-Output Configuration:
The terminal sends and receives serial information.
Each quantity of information has eight bits of an alphanumeric code.
The serial information from the keyboard is shifted into the input register INPR.
The serial information for the printer is stored in the output register OUTR.
These two registers communicate with a communication interface serially and with the AC in parallel.
The input-output configuration is shown in Fig. 5-12.
The input register INPR consists of eight bits and holds alphanumeric input information.
The 1-bit input flag FGI is a control flip-flop.
The flag bit is set to 1 when new information is available in the input device and is cleared to 0 when the information is accepted by the computer.
The output register OUTR works similarly but the direction of information flow is reversed.
Initially, the output flag FGO is set to 1.
The computer checks the flag bit; if it is 1, the information from AC is transferred in parallel to OUTR and FGO is cleared to 0.
The output device accepts the coded information, prints the corresponding character, and when the operation is completed, it sets FGO to 1.
Input-Output Instructions:
Input and output instructions are needed for transferring information to and from the AC register, for checking the flag bits, and for controlling the interrupt facility.
Input-output instructions have an operation code 1111 and are recognized by the control when D7 = 1 and I = 1.
The remaining bits of the instruction specify the particular operation.
The control functions and microoperations for the input-output instructions are listed in Table 5-5.
These instructions are executed with the clock transition associated with timing signal T3.
Each control function needs the Boolean relation D7IT3, which we designate for convenience by the symbol p.
The control function is distinguished by one of the bits in IR(6-11).
By assigning the symbol Bi to bit i of IR, all control functions can be denoted by pBi for i = 6 through 11.
The sequence counter SC is cleared to 0 when p = D7IT3 = 1.
The last two instructions set and clear an interrupt enable flip-flop IEN.
Program Interrupt:
Under programmed control, the computer keeps checking the flag bit, and when it finds it set, it initiates an information transfer.
The difference in information flow rate between the computer and the input-output device makes this type of transfer inefficient.
An alternative to the program-controlled procedure is to let the external device inform the computer when it is ready for the transfer.
In the meantime the computer can be busy with other tasks. This type of transfer uses the interrupt facility.
While the computer is running a program, it does not check the flags.
When a flag is set, the computer is momentarily interrupted from the current program.
The computer deviates momentarily from what it is doing to perform the input or output transfer.
It then returns to the current program to continue what it was doing before the interrupt.
The interrupt enable flip-flop IEN can be set and cleared with two instructions.
o When IEN is cleared to 0 (with the IOF instruction), the flags cannot interrupt the computer.
o When IEN is set to 1 (with the ION instruction), the computer can be interrupted.
The way that the interrupt is handled by the computer can be explained by means of the flowchart of Fig. 5-13.
An interrupt flip-flop R is included in the computer. When R = 0, the computer goes through an instruction cycle.
During the execute phase of the instruction cycle IEN is checked by the control.
If it is 0, it indicates that the programmer does not want to use the interrupt, so control continues with the next instruction cycle.
If IEN is 1, control checks the flag bits. If both flags are 0, it indicates that neither the input nor the output registers are ready for transfer of information. In this case, control continues with the next instruction cycle.
If either flag is set to 1 while IEN = 1, flip-flop R is set to 1. At the end of the execute phase, control checks the value of R, and if it is equal to 1, it goes to an interrupt cycle instead of an instruction cycle.
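The decision described above reduces to a small Boolean check. The sketch below is only an illustration of that check; the function name is hypothetical, while IEN, FGI, FGO, and R are the flip-flops named in the text.

```python
def next_cycle(ien: int, fgi: int, fgo: int) -> str:
    """Decide which cycle follows the execute phase."""
    # R is set only when interrupts are enabled and at least one flag is set.
    r = 1 if ien == 1 and (fgi == 1 or fgo == 1) else 0
    return "interrupt cycle" if r == 1 else "instruction cycle"


print(next_cycle(ien=0, fgi=1, fgo=0))  # instruction cycle
print(next_cycle(ien=1, fgi=0, fgo=0))  # instruction cycle
print(next_cycle(ien=1, fgi=0, fgo=1))  # interrupt cycle
```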
Interrupt Cycle:
The interrupt cycle is a hardware implementation of a branch and save return address operation.
The return address available in PC is stored in a specific location.
This location may be a processor register, a memory stack, or a specific memory location.
An example that shows what happens during the interrupt cycle is shown in Fig. 5-14.
What is a control unit?
The function of the control unit in a digital computer is to initiate sequences of microoperations.
A control unit can be implemented in two ways:
Hardwired control
Microprogrammed control
Hardwired Control:
When the control signals are generated by hardware using conventional logic design techniques, the control unit is said to be hardwired.
The key characteristics are
High speed of operation
Expensive
Relatively complex
No flexibility of adding new instructions
Microprogrammed Control:
Control information is stored in control memory.
Control memory is programmed to initiate the required sequence of micro-operations.
The key characteristics are
Speed of operation is low when compared with hardwired control
Less complex
Less expensive
Flexibility to add new instructions
Control Memory
The control function that specifies a microoperation is called a control variable.
When a control variable is in one binary state, the corresponding microoperation is executed. For the other binary state, the state of the registers does not change.
The active state of a control variable may be either the 1 state or the 0 state, depending on the application.
Example:
For bus-organized systems, the control signals that specify microoperations are groups of bits that select the paths in multiplexers, decoders, and arithmetic logic units.
Control Word:
The control variables at any given time can be represented by a string of 1's and 0's called a control word.
All control words can be programmed to perform various operations on the components of the system.
Microprogram Control Unit:
A control unit whose binary control variables are stored in memory is called a microprogrammed control unit.
The control word in control memory contains within it a microinstruction.
The microinstruction specifies one or more micro-operations for the system.
A sequence of microinstructions constitutes a microprogram.
The control unit consists of control memory used to store the microprogram.
Control memory is a permanent memory, i.e., a read-only memory (ROM).
The general configuration of a microprogrammed control unit is shown in the block diagram below.
The control memory is ROM, so all control information is permanently stored.
The control memory address register (CAR) specifies the address of the microinstruction, and the control data register (CDR) holds the microinstruction read from memory.
The next address generator is sometimes called a microprogram sequencer. It is used to generate the next microinstruction address.
The location of the next microinstruction may be the one next in sequence, or it may be located somewhere else in the control memory.
So it is necessary to use some bits of the present microinstruction to control the generation of the address of the next microinstruction.
Sometimes the next address may also be a function of external input conditions.
The control data register holds the present microinstruction while the next address is computed and read from memory. The data register is sometimes called a pipeline register.
A computer with a microprogrammed control unit will have two separate memories:
a main memory (RAM)
a control memory (ROM)
The microprogram consists of microinstructions that specify various internal control signals for execution of register microoperations.
These microinstructions generate the microoperations to:
fetch the instruction from main memory
evaluate the effective address
execute the operation
return control to the fetch phase for the next instruction
Addressing Sequence
Microinstructions are stored in control memory in groups, with each group specifying a routine.
Each computer instruction has its own microprogram routine to generate the microoperations.
The hardware that controls the address sequencing of the control memory must be capable of sequencing the microinstructions within a routine and be able to branch from one routine to another.
The steps the control must undergo during the execution of a single computer instruction are as follows:
An initial address is loaded into the control address register (CAR) when power is turned on in the computer.
This address is usually the address of the first microinstruction that activates the instruction fetch routine.
At the end of the fetch routine, the instruction is placed in the instruction register (IR).
The control memory then goes through the routine to determine the effective address of the operand with the help of mode bits and branch microinstructions.
At the end of this routine, the address register AR holds the operand address.
The next step is to generate the microoperations that execute the instruction fetched from memory by considering the opcode and applying a mapping process.
The transformation of the instruction code bits to an address in control memory where the routine of the instruction is located is referred to as the mapping process.
After execution, control must return to the fetch routine by executing an unconditional branch.
In brief, the address sequencing capabilities required in a control memory are:
1. Incrementing of the control address register.
2. Unconditional branch or conditional branch, depending on status bit conditions.
3. A mapping process from the bits of the instruction to an address for control memory.
4. A facility for subroutine call and return.
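The four capabilities listed above can be imitated with a toy next-address generator. This is a sketch under stated assumptions, not the circuit of these notes: the class name, the single return-address register (called sbr here), the start address 64, and the mapping rule (opcode times 4, that is, four control words per instruction routine in a 128-word control memory) are all assumptions made for the illustration. The branch names JMP, CALL, RET, and MAP are used only as labels for the four cases.

```python
class Sequencer:
    """Toy next-address generator for a 128-word control memory."""

    def __init__(self, start: int = 64):
        self.car = start      # control address register (assumed start address)
        self.sbr = 0          # hypothetical register holding one return address

    def step(self, branch: str, condition: bool = True, ad: int = 0, opcode: int = 0) -> int:
        if branch == "JMP":
            # Branch when the condition holds, otherwise increment CAR.
            self.car = ad if condition else self.car + 1
        elif branch == "CALL":
            if condition:
                self.sbr = self.car + 1     # save the return address
                self.car = ad
            else:
                self.car += 1
        elif branch == "RET":
            self.car = self.sbr             # return from subroutine
        elif branch == "MAP":
            self.car = (opcode & 0xF) * 4   # assumed mapping: 4 words per routine
        else:
            raise ValueError(f"unknown branch type: {branch}")
        self.car &= 0x7F                    # keep the address within 7 bits
        return self.car


seq = Sequencer()                       # CAR starts at the assumed address 64
trace = [
    seq.step("JMP", ad=65),             # unconditional branch to the next word
    seq.step("MAP", opcode=0b0001),     # map opcode 0001 to its routine
    seq.step("CALL", ad=67),            # call a subroutine, saving CAR + 1
    seq.step("RET"),                    # return to the saved address
    seq.step("JMP", condition=False),   # condition not met: CAR is incremented
]
print(trace)  # [65, 4, 67, 5, 6]
```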
Selection of address for control memory
The microinstruction in control memory contains a set of bits to initiate microoperations in computer registers and other bits to specify the method by which the next address is obtained.
The figure shows four different paths from which the control address register (CAR) receives the address.
Three fields for an instruction:
1-bit field for direct/indirect addressing
4-bit opcode
11-bit address field
Microinstruction format
The microinstruction format is composed of 20 bits divided into four parts:
Three fields F1, F2, and F3 specify microoperations for the computer [3 bits each]
The CD field selects status bit conditions [2 bits]
The BR field specifies the type of branch to be used [2 bits]
The AD field contains a branch address [7 bits] because control memory has 128 words
Each of the three microoperation fields can specify one of seven possibilities.
No more than three microoperations can be chosen for a microinstruction.
If fewer than three are needed, the code 000 = NOP is used.
The three bits in each field are encoded to specify seven distinct microoperations, listed in the table below.
The condition field (CD) is two bits to specify four status bit conditions.
The branch field (BR) consists of two bits and is used with the address field to choose the address of the next microinstruction.
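The 20-bit layout can be made concrete by unpacking a word into its fields. The sketch assumes the fields are laid out from left to right in the order they are listed above (F1, F2, F3, CD, BR, AD); the function name and the sample word are hypothetical.

```python
def unpack_microinstruction(word: int) -> dict[str, int]:
    """Split a 20-bit microinstruction into its fields, leftmost field first."""
    return {
        "F1": (word >> 17) & 0b111,      # bits 17-19
        "F2": (word >> 14) & 0b111,      # bits 14-16
        "F3": (word >> 11) & 0b111,      # bits 11-13
        "CD": (word >> 9) & 0b11,        # bits 9-10
        "BR": (word >> 7) & 0b11,        # bits 7-8
        "AD": word & 0b1111111,          # bits 0-6
    }


word = 0b110_000_000_00_00_1000001       # sample word, fields separated by underscores
print(unpack_microinstruction(word))
# {'F1': 6, 'F2': 0, 'F3': 0, 'CD': 0, 'BR': 0, 'AD': 65}
```

The field widths add up to 3 + 3 + 3 + 2 + 2 + 7 = 20 bits, and the 7-bit AD field can address all 128 words of control memory.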
Symbolic microinstructions
Different symbols can be used to construct the microinstructions in symbolic form.
Each line of an assembly language microprogram defines a symbolic microinstruction and is divided into five parts:
Label
Microoperations
CD
BR
AD
1. The label field may be empty or it may specify a symbolic address. It is terminated with a colon (:).
2. The microoperations field consists of one to three symbols, separated by commas, with only one symbol from each F field. If it is NOP, it is translated to 9 zeros.
3. The condition field specifies one of the four conditions U, I, S, Z.
4. The branch field has one of the four branch symbols JMP, CALL, RET, MAP.
5. The address field has three formats:
a. A symbolic address, which must also be a label
b. The symbol NEXT to designate the next address in sequence
c. Empty if the branch field is RET or MAP, in which case it is converted to 7 zeros
The symbol ORG defines the origin, i.e., the first address of a microprogram routine.
E.g.: ORG 64 places the first microinstruction at control memory address 1000000 (binary), which is equivalent to decimal 64.
Fetch routine
The control memory has 128 locations, each 20 bits wide.
The first 64 locations (addresses 0-63) are occupied by the routines for the 16
instructions.
The fetch routine starts at address 64.
The fetch routine requires three microinstructions,
at locations 64-66.
The microoperations needed for the fetch routine are:
The address of the instruction is transferred from PC to AR, the
instruction is read from memory into DR, and PC is
incremented.
The address part is transferred to AR and control is
transferred to one of the 16 routines by mapping the operation code
part of the instruction from DR into CAR.
Using the assembly language conventions above, we can write symbolic
microprograms as shown in the table.
Design of control unit
The control memory output of each subfield must be decoded to provide the
distinct microoperations.
The outputs of the decoders are connected to the appropriate inputs in the
processor unit.
The figure shows the three decoders and some of the connections that must be
made from their outputs.
Each of the three fields of the microinstruction at the output of control memory is
decoded with a 3x8 decoder to provide eight outputs.
Each of the outputs must be connected to the proper circuit to initiate the
corresponding microoperation.
When F1 = 101 (binary 5), the next clock pulse transition transfers the content of DR
(0-10) to AR.
Similarly, when F1 = 110 (binary 6) there is a transfer from PC to AR (symbolized
by PCTAR).
As shown in the figure, outputs 5 and 6 of decoder F1 are connected to the load
input of AR, so that when either one of these outputs is active,
information from the multiplexers is transferred to AR.
The multiplexers select the information from DR when output 5 is active and
from PC when output 5 is inactive.
The transfer into AR occurs with a clock transition only when output 5 or output
6 of the decoder is active.
For the arithmetic logic shift unit, the control inputs no longer
come from logic gates; they now come from the decoder
outputs marked AND, ADD and DRTAC, respectively.
Microprogram sequencer
The basic components of a microprogrammed control unit are
the control memory and the circuits that select the next
address.
The address selection part is called a microprogram sequencer.
The purpose of a microprogram sequencer is to present an
address to the control memory so that a microinstruction may
be read and executed.
The next-address logic of the sequencer determines
the specific address to be loaded into the control address register.
The block diagram of the microprogram sequencer is shown in
the figure.
The control memory is included in the diagram to show the
interaction between the sequencer and the memory attached
to it.
There are two multiplexers in the circuit.
1. The first multiplexer selects an address from one of four
sources and routes it into control address register CAR.
2. The second multiplexer tests the value of a selected status
bit and the result of the test is applied to an input logic
circuit.
The output from CAR provides the address for the control
memory.
The content of CAR is incremented and applied to one of the
multiplexer inputs and to the subroutine register SBR.
The other three inputs to the multiplexer come from
1. The address field of the present microinstruction
2. The output of SBR
3. An external source that maps the instruction
The CD (condition) field of the microinstruction selects one of
the status bits in the second multiplexer.
If the bit selected is equal to 1, the T variable is equal to 1;
otherwise, it is equal to 0.
The T value together with two bits from the BR (branch) field
goes to an input logic circuit.
The input logic in a particular sequencer will determine the
type of operations that are available in the unit.
The input logic circuit in above figure has three inputs I0, I1,
and T, and three outputs, S0, S1, and L.
Variables S0 and S1 select one of the source addresses for
CAR. Variable L enables the load input in SBR.
The binary values of the selection variables determine the
path in the multiplexer.
For example, with S1,S0 = 10, multiplexer input number 2 is
selected and establishes transfer path from SBR to CAR.
Inputs I1 and I0 are identical to the bit values in the
BR field.
The bit values for S1 and S0 are determined from
the stated function and the path in the multiplexer
that establishes the required transfer.
The subroutine register is loaded with the
incremented value of CAR during a call
microinstruction (BR = 01) provided that the status
bit condition is satisfied (T = 1).
The truth table can be used to obtain the simplified
Boolean functions for the input logic circuit:
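The simplified functions themselves are not reproduced in the text. The sketch below assumes the usual textbook result for this sequencer, namely S1 = I1, S0 = I1·I0 + I1'·T and L = I1'·I0·T, and assumes multiplexer inputs 0 to 3 are the incremented CAR, the AD field, SBR and the mapped address (only input 2 = SBR is stated explicitly above). The function names are hypothetical.

```python
def input_logic(i1, i0, t):
    """Input logic of the sequencer: returns (S1, S0, L) from BR bits I1, I0 and test bit T."""
    s1 = i1
    s0 = (i1 & i0) | ((1 - i1) & t)
    load_sbr = (1 - i1) & i0 & t
    return s1, s0, load_sbr


def next_address(car, sbr, ad, mapped, br, t):
    """Return (new CAR, new SBR) for a 7-bit control address register."""
    i1, i0 = (br >> 1) & 1, br & 1
    s1, s0, load_sbr = input_logic(i1, i0, t)
    incremented = (car + 1) & 0x7F
    mux_inputs = (incremented, ad, sbr, mapped)  # inputs 0, 1, 2, 3 (assumed order)
    new_sbr = incremented if load_sbr else sbr   # SBR saves the return address on a call
    return mux_inputs[(s1 << 1) | s0], new_sbr


# Call (BR = 01) with the condition satisfied: branch to AD and save CAR + 1 in SBR.
print(next_address(car=10, sbr=0, ad=40, mapped=0, br=0b01, t=1))  # (40, 11)
# Return (BR = 10): CAR is loaded from SBR.
print(next_address(car=40, sbr=11, ad=0, mapped=0, br=0b10, t=0))  # (11, 11)
```

With BR = 00 or 01 and T = 0 the logic selects input 0, so the sequencer simply steps to the next microinstruction.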
Central Processing Unit
The main part of the computer that performs the bulk of
data-processing operations is called the central processing
unit and is referred to as the CPU.
The CPU is made up of three major parts, as shown in the figure:
1. The register set stores intermediate data used during the
execution of the instructions.
2. The arithmetic logic unit (ALU) performs the required
microoperations for executing the instructions.
3. The control unit supervises the transfer of information
among the registers and instructs the ALU as to which
operation to perform.
General register organization
Memory locations are needed for storing pointers, counters, return addresses, temporary results, etc.
Referring to these memory locations is very time consuming, because memory access is the most time-consuming
operation in a computer.
Therefore it is convenient to store these intermediate values in processor registers.
When there are many registers in the system, they are connected through a common bus system.
The registers communicate with each other for data transfer as well as for performing microoperations.
Hence it is necessary to provide a common unit that performs arithmetic, logic and shift operations in the
processor.
A bus organization for 7 CPU registers is shown in the figure.
The output of each register is connected to two
multiplexers (MUX) to form the two buses A and B.
The selection lines in each multiplexer select one register or the
input data for the particular bus.
The A and B buses form the inputs to a common arithmetic logic
unit (ALU).
The operation selected in the ALU determines the arithmetic or
logic micro operation that is to be performed.
The result of micro operation goes into the inputs of all the
registers.
The register that receives the information is selected by a
decoder.
The decoder activates one of the register load inputs, thus
providing transfer path between the data in the output bus and
the inputs of the selected destination register.
The control unit that operates the CPU bus system directs the
information flow through the registers and ALU by selecting the
various components in the system.
E.g. to perform the operation
R1 ← R2 + R3
the control must provide binary selection variables to the
following selector inputs:
1. MUX A selector (SELA): to place the content of R2
into bus A.
2. MUX B selector (SELB): to place the content of R3 into
bus B.
3. ALU operation selector (OPR): to provide the
arithmetic addition A + B.
4. Decoder destination selector (SELD): to transfer the
content of the output bus into R1.
The four control selection variables are generated in the control unit and must be available at the beginning
of a clock cycle.
The data from the two source registers propagate through the gates in the multiplexers and the ALU, to the
output bus, and onto the inputs of the destination register, all during the clock cycle interval.
Then, when the next clock transition occurs, the binary information from the output bus is transferred into
R1.
To achieve a fast response time, the ALU is constructed with high-speed circuits. The buses
are implemented with multiplexers or three-state gates.
Control word
A control word is a word whose individual bits represent various control signals.
There are 14 selection inputs in the unit, and their combined value specifies a control word. The 14-bit
control word is defined in the following figure; it consists of four fields.
Three fields contain 3 bits each and the last field contains 5 bits.
The three bits of SELA select a source register for the A input of the ALU.
The three bits of SELB select a source register for the B input of the ALU.
The three bits of SELD select a destination register using the decoder and its seven load outputs.
The five bits of OPR select one of the operations in the ALU.
The 14-bit control word, when applied to the selection inputs, specifies a particular microoperation.
The encoding of the register selections is specified in the table.
The 3-bit binary code listed in the first column of the table specifies the binary code
for each of the three fields.
The register selected by fields SELA, SELB, and SELD is the one whose decimal
number is equivalent to the binary number in the code. When SELA or SELB is
000, the corresponding multiplexer selects the external input data.
When SELD = 000, no destination register is selected but the contents of the
output bus are available in the external output.
The ALU provides arithmetic and logic operations.
The CPU must also provide shift operations. The shifter may be placed at the
input of the ALU to provide a preshift capability, or at the output of the ALU to provide
a postshift capability.
In some cases, the shift operations are included with the ALU.
The encoding of the ALU operations for the CPU is shown in the following
table.
The OPR field has 5 bits and each operation is designated with a symbolic
name.
Examples of microoperations
A control word of 14 bits is needed to specify a microoperation in the
CPU. The control word for a given microoperation can be derived from
the selection variables.
For example, the subtract microoperation given by the statement
R1 ← R2 - R3
specifies R2 for the A input of the ALU, R3 for the B input of the ALU, R1 for the
destination register, and an ALU operation to subtract A - B.
Thus the control word is specified by the four fields and the
corresponding binary value for each field is obtained from the encoding
listed in Tables 1 and 2.
The binary control word for the subtract microoperation is
01001100100101 and is obtained as follows:
Field:          SELA   SELB   SELD   OPR
Symbol:         R2     R3     R1     SUB
Control word:   010    011    001    00101
The control word for this microoperation and a few others are listed in Table 3.
The increment and transfer microoperations do not use the B input of the
ALU.
For these cases, the B field is marked with a dash. We assign 000 to any
unused field when formulating the binary control word, although any other
binary number may be used.
To place the content of a register into the output terminals, we place the
content of the register into the A input of the ALU, but none of the registers
is selected to accept the data.
The ALU operation TSFA places the data from the register, through the ALU, into
the output terminals.
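The derivation of a control word can be sketched in a few lines. The helper below is hypothetical; it assumes the field order SELA, SELB, SELD, OPR from left to right, and the only OPR code used (SUB = 00101) is the one that can be read off the control word quoted above. Register Rn is encoded as the number n, as the text describes.

```python
SUB = 0b00101  # OPR code for subtract, taken from the control word 01001100100101


def control_word(sela, selb, seld, opr):
    """Build the 14-bit control word from its four fields (3 + 3 + 3 + 5 bits)."""
    for name, value, width in (("SELA", sela, 3), ("SELB", selb, 3),
                               ("SELD", seld, 3), ("OPR", opr, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return (sela << 11) | (selb << 8) | (seld << 5) | opr


# R1 <- R2 - R3: A input from R2, B input from R3, destination R1.
word = control_word(sela=2, selb=3, seld=1, opr=SUB)
print(format(word, "014b"))  # 01001100100101
```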
Most computers fall into one of the three types of organizations.
Some computers combine features from more than one organizational structure.
To illustrate the influence of the number of addresses on computer programs, we will evaluate the arithmetic statement
X = (A + B) * (C + D)
using zero-, one-, two-, or three-address instructions.
We will use the symbols ADD, SUB, MUL and DIV for the four arithmetic operations, MOV
for the transfer-type operations,
and LOAD and STORE for transfers to and from memory and the AC register.
We assume that the operands are in memory addresses A, B, C, and D, that the result must be stored in memory at address
X, and that the CPU has general-purpose registers R1, R2, R3 and R4.
Three Address Instructions:
Three-address instruction formats can use each address field to specify either a processor register or a
memory operand.
The assembly language program that evaluates X = (A + B) * (C + D) is shown below, together with comments
that explain the register transfer operation of each instruction.
Two Address Instructions:
The MOV instruction moves or transfers the operands to and from memory and processor registers.
The first symbol listed in an instruction is assumed to be both a source and the destination where the result of the
operation is transferred.
One Address Instructions:
One-address instructions use an implied accumulator (AC) register for all data manipulation. AC
contains the result of all operations.
The program to evaluate X = (A + B) * (C + D) is
All operations are done between the AC register and a memory operand.
T is the address of a temporary memory location required for storing the intermediate result.
Zero Address Instructions:
A stack-organized computer does not use an address field for the instructions ADD and MUL.
The PUSH and POP instructions, however, need an address field to specify the operand that communicates with
the stack.
The following program shows how X = (A + B) * (C + D) will be written for a stack-organized computer (TOS stands for
top of stack).
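The program listing is not reproduced in these notes. As a rough illustration, the sketch below interprets one plausible zero-address sequence (PUSH A, PUSH B, ADD, PUSH C, PUSH D, ADD, MUL, POP X) on a tiny stack machine. The instruction sequence, the interpreter and the sample operand values are all assumptions made for this example.

```python
def run_stack_program(program, memory):
    """Execute zero-address instructions; ADD and MUL work on the top two stack items."""
    stack = []
    for op, *operand in program:
        if op == "PUSH":            # TOS <- M[address]
            stack.append(memory[operand[0]])
        elif op == "POP":           # M[address] <- TOS
            memory[operand[0]] = stack.pop()
        elif op == "ADD":           # TOS <- (next-to-top) + (top)
            top, below = stack.pop(), stack.pop()
            stack.append(below + top)
        elif op == "MUL":           # TOS <- (next-to-top) * (top)
            top, below = stack.pop(), stack.pop()
            stack.append(below * top)
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


program = [
    ("PUSH", "A"), ("PUSH", "B"), ("ADD",),   # TOS <- A + B
    ("PUSH", "C"), ("PUSH", "D"), ("ADD",),   # TOS <- C + D
    ("MUL",),                                 # TOS <- (A + B) * (C + D)
    ("POP", "X"),                             # M[X] <- TOS
]
memory = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(memory["X"])  # 45
```

Only PUSH and POP carry an address; ADD and MUL find their operands implicitly on the stack, which is the point made in the text.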
Addressing modes
Operands are chosen during program execution depending on the addressing mode of the instruction.
Computers use addressing mode techniques
1. To provide facilities such as pointers to memory, counters for loop control, indexing of data, and program relocation.
2. To reduce the number of bits in the addressing field of the instruction.
Types of addressing modes
Implied Mode
Immediate Mode
Register Mode
Register Indirect Mode
Autoincrement or Autodecrement Mode
Direct Address Mode
Indirect Address Mode
Relative Address Mode
Indexed Addressing Mode
Base Register Addressing Mode
Immediate Mode:
In this mode the operand is specified in the instruction itself.
In other words, an immediate-mode instruction has an operand field rather than an address field.
Immediate-mode instructions are useful for initializing registers to a constant value.
Register Mode:
When the address field specifies a processor register, the instruction is said to be in the register mode. In this
mode the operands are in registers that reside within the CPU.
The particular register is selected from a register field in the instruction.
Register Indirect Mode:
In this mode the instruction specifies a register in the CPU whose contents give the address of the operand in memory.
In other words, the selected register contains the address of the operand rather than the operand itself.
The advantage of a register indirect mode instruction is that the address field of the instruction uses fewer bits to select
a register than would have been required to specify a memory address directly.
Auto-increment or Auto-decrement Mode:
This is similar to the register indirect mode except that the register is incremented or decremented after (or
before) its value is used to access memory.
The address field of an instruction is used by the control unit in the CPU to obtain the operand from memory.
Sometimes the value given in the address field is the address of the operand, but sometimes it is just an address
from which the address of the operand is calculated.
The two basic modes of addressing used in the CPU are the direct and indirect address modes.
Direct Address Mode:
In this mode the effective address is equal to the address part of the instruction.
The operand resides in memory and its address is given directly by the address field of the instruction. In a
branch-type instruction the address field specifies the actual branch address.
Indirect Address Mode:
In this mode the address field of the instruction gives the address where the effective address is stored in memory.
Control fetches the instruction from memory and uses its address part to access memory again to read the effective
address.
A few addressing modes require that the address field of the instruction be added to the content of a specific
register in the CPU.
The effective address in these modes is obtained from the following computation:
Effective address = address part of instruction + content of CPU register
The CPU register used in the computation may be the program counter, an index register, or a base register. Each choice
gives a different addressing mode, which is used for a different application.
Relative Address Mode:
In this mode the content of the program counter is added to the address part of the instruction in order to obtain the
effective address.
Indexed Addressing Mode:
In this mode the content of an index register is added to the address part of the instruction to obtain the effective address.
An index register is a special CPU register that contains an index value.
Base Register Addressing Mode:
In this mode the content of a base register is added to the address part of the instruction to obtain the effective
address.
This is similar to the indexed addressing mode except that the register is now called a base register instead of an index
register.
Example
To show the differences between the various modes, we will
show the effect of the addressing modes on the instruction
defined in Fig
The two-word instruction at address 200 and 201 is a "load to
AC" instruction with an address field equal to 500.
The first word of the instruction specifies the operation code
and mode, and the second word specifies the address part.
PC has the value 200 for fetching this instruction. The content
of processor register R1 is 400, and the content of an index
register XR is 100.
AC receives the operand after the instruction is executed.
In the direct address mode the effective address is the
address part of the instruction 500 and the operand to be
loaded into AC is 800.
In the immediate mode the second word of the instruction is
taken as the operand rather than an address, so 500 is loaded
into AC
In the indirect mode the effective address is stored in memory
at address 500. Therefore, the effective address is 800 and the
operand is 300.
In the relative mode the effective address is 500 + 202 = 702
and the operand is 325. (The value in PC after the fetch phase
and during the execute phase is 202.)
In the index mode the effective address is XR + 500 = 100 +
500 = 600 and the operand is 900.
In the register mode the operand is in R1 and 400 is loaded
into AC.
In the register indirect mode the effective address is 400,
equal to the content of R1 and the operand loaded into AC
is 700.
The auto-increment mode is the same as the register
indirect mode except that R1 is incremented to 401 after
the execution of the instruction.
The auto-decrement mode decrements R1 to 399 prior to
the execution of the instruction. The operand loaded into
AC is now 450.
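The numerical example can be replayed in code. In the sketch below the memory contents are the ones implied by the walkthrough above (for instance address 500 holds 800 and address 800 holds 300); the function name and the mode labels are hypothetical.

```python
# Memory words implied by the example: address -> content.
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}


def load_operand(mode, address=500, pc=202, r1=400, xr=100, memory=MEMORY):
    """Return (effective address, operand loaded into AC, R1 after execution)."""
    if mode == "immediate":
        return None, address, r1        # the address field itself is the operand
    if mode == "register":
        return None, r1, r1             # the operand is the content of R1
    if mode == "direct":
        ea = address
    elif mode == "indirect":
        ea = memory[address]
    elif mode == "relative":
        ea = pc + address               # PC already points past the instruction
    elif mode == "indexed":
        ea = xr + address
    elif mode == "register_indirect":
        ea = r1
    elif mode == "autoincrement":
        ea = r1
        r1 += 1                         # R1 is incremented after use
    elif mode == "autodecrement":
        r1 -= 1                         # R1 is decremented before use
        ea = r1
    else:
        raise ValueError(f"unknown mode {mode}")
    return ea, memory[ea], r1


for mode in ("direct", "immediate", "indirect", "relative", "indexed",
             "register", "register_indirect", "autoincrement", "autodecrement"):
    print(mode, load_operand(mode))
```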
Data Transfer and Manipulation
Most computer instructions can be classified into three
categories:
1. Data transfer instructions
2. Data manipulation instructions
3. Program control instructions
Data Transfer Instructions:
Data transfer instructions move data from one place in the
computer to another without changing the data content.
The most common transfers are between memory and
processor registers, between processor registers and
input or output, and between the processor registers
themselves.
The table gives a list of eight data transfer instructions used in
many computers.
Data Manipulation Instructions:
Data manipulation instructions perform operations on data and
provide the computational capabilities for the computer.
The data manipulation instructions in a typical computer are usually
divided into three basic types:
1. Arithmetic instructions
2. Logical and bit manipulation instructions
3. Shift instructions
Arithmetic instructions
The four basic arithmetic operations are addition, subtraction,
multiplication and division.
Most computers provide instructions for all four operations.
Some small computers have only addition and possibly subtraction
instructions.
Multiplication and division must then be generated by means of
software subroutines.
A list of typical arithmetic instructions is given in Table 8-7.
Logical and bit manipulation instructions
Logical instructions perform binary operations on strings of bits
stored in registers.
They are useful for manipulating individual bits or a group of bits that
represent binary-coded information.
The logical instructions consider each bit of the operand
separately and treat it as a Boolean variable.
By proper application of the logical instructions it is possible
to change bit values, to clear a group of bits, or to insert new bit
values into operands stored in registers or memory words.
Some typical logical and bit manipulation instructions are listed in
the table.
Shift Instructions
Shifts are operations in which the bits of a word are moved to
the left or right.
The bit shifted in at the end of the word determines the
type of shift used.
Shift instructions may specify logical shifts,
arithmetic shifts, or rotate-type operations.
In each case the shift may be to the right or to the left.
Table 8-9 lists four types of shift instructions.
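The difference between the shift types lies only in the bit that enters at the vacated end. A minimal sketch for an 8-bit word follows; the function names are chosen for this illustration and are not taken from the table.

```python
MASK = 0xFF  # 8-bit word


def logical_shift_right(x):
    return (x & MASK) >> 1                      # 0 enters at the left


def logical_shift_left(x):
    return (x << 1) & MASK                      # 0 enters at the right


def arithmetic_shift_right(x):
    return ((x & MASK) >> 1) | (x & 0x80)       # the sign bit is preserved


def rotate_right(x):
    return ((x & MASK) >> 1) | ((x & 1) << 7)   # bit 0 re-enters at bit 7


def rotate_left(x):
    return ((x << 1) & MASK) | ((x >> 7) & 1)   # bit 7 re-enters at bit 0


x = 0b10010110
for shift in (logical_shift_right, logical_shift_left, arithmetic_shift_right,
              rotate_right, rotate_left):
    print(f"{shift.__name__:24s} {format(shift(x), '08b')}")
```

For this value the logical and rotate right shifts coincide because bit 0 is 0; the arithmetic right shift differs because it keeps the sign bit at 1.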
Program control
Program control instructions specify conditions for altering the content of the
program counter.
The change in value of the program counter as a result of the execution of a program
control instruction causes a break in the sequence of instruction execution.
This type of instruction provides control over the flow of program execution and a capability
for branching to different program segments.
Some typical program control instructions are listed in the table.
Branch and jump instructions may be conditional or unconditional.
An unconditional branch instruction causes a branch to the specified address
without any conditions.
The conditional branch instruction specifies a condition such as branch if
positive or branch if zero.
The skip instruction does not need an address field and is therefore a zero-address
instruction.
A conditional skip instruction will skip the next instruction if the condition is met.
This is accomplished by incrementing the program counter.
The call and return instructions are used in conjunction with subroutines.
The compare instruction performs a subtraction between two operands, but the
result of the operation is not retained. However, certain status bit
conditions are set as a result of the operation.
Similarly, the test instruction performs the logical AND of two operands and updates
certain status bits without retaining the result or changing the operands.
Status Bit Conditions:
The ALU circuit in the CPU has a status register for storing the status
bit conditions.
Status bits are also called condition-code bits or flag bits.
The following figure shows the block diagram of an 8-bit ALU with a 4-bit
status register.
The four status bits are symbolized by C, S, Z, and V. The bits are set
or cleared as a result of an operation performed in the ALU.
Bit C (carry) is set to 1 if the end carry C8 is 1. It is cleared to 0 if the
carry is 0.
Bit S (sign) is set to 1 if the highest-order bit F7 is 1. It is set to 0 if the bit
is 0.
Bit Z (zero) is set to 1 if the output of the ALU contains all 0's. It is
cleared to 0 otherwise. In other words, Z = 1 if the output is
zero and Z = 0 if the output is not zero.
Bit V (overflow) is set to 1 if the exclusive-OR of the last two carries
is equal to 1, and cleared to 0 otherwise.
The above status bits are used in conditional jump and branch
instructions.
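The four status bits can be computed directly from their definitions. The sketch below models only the addition path of an 8-bit ALU; the function name is hypothetical, and C7 denotes the carry into the sign position, so that V = C7 XOR C8.

```python
def alu_add(a, b, carry_in=0):
    """Add two 8-bit values and return (result F, status bits C, S, Z, V)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b + carry_in
    result = total & 0xFF
    c8 = total >> 8                                   # end carry out of bit 7
    c7 = ((a & 0x7F) + (b & 0x7F) + carry_in) >> 7    # carry into bit 7
    status = {
        "C": c8,
        "S": result >> 7,            # copy of the highest-order bit F7
        "Z": int(result == 0),
        "V": c7 ^ c8,                # exclusive-OR of the last two carries
    }
    return result, status


print(alu_add(0x70, 0x70))  # signed overflow: two positive values give a negative result
print(alu_add(0xFF, 0x01))  # result 0 with an end carry
```

A compare instruction can be modelled the same way by adding the 2's complement of the second operand and keeping only the status bits.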
Module 3
Data Representation
Section 3.1 – Data Types
Registers contain either data or control information
Control information is a bit or group of bits used to specify the sequence of
command signals needed for data manipulation
Data are numbers and other binary-coded information that are operated on
Possible data types in registers:
o Numbers used in computations
o Letters of the alphabet used in data processing
o Other discrete symbols used for specific purposes
All types of data, except binary numbers, are represented in binary-coded form
A number system of base, or radix, r is a system that uses distinct symbols for r
digits
Numbers are represented by a string of digit symbols
The string of digits 724.5 represents the quantity 7 x 10^2 + 2 x 10^1 + 4 x 10^0 + 5 x 10^-1
The string of digits 101101 in the binary number system represents the quantity 1 x 2^5
+ 0 x 2^4 + 1 x 2^3 + 1 x 2^2 + 0 x 2^1 + 1 x 2^0 = 45
(101101)_2 = (45)_10
We will also use the octal (radix 8) and hexadecimal (radix 16) number systems
(F3)_16 = F x 16^1 + 3 x 16^0 = (243)_10
Complements are used in digital computers for simplifying subtraction and logical
manipulation
Two types of complements exist for each base-r system: the r's complement and the (r – 1)'s
complement
Given a number N in base r having n digits, the (r – 1)'s complement of N is
defined as (r^n – 1) – N
For decimal, the 9's complement of N is (10^n – 1) – N
The 9's complement of 546700 is 999999 – 546700 = 453299
The 9's complement of 453299 is 999999 – 453299 = 546700
For binary, the 1's complement of N is (2^n – 1) – N
The 1's complement of 1011001 is 1111111 – 1011001 = 0100110
The 1's complement is the bitwise complement of the number: just toggle all bits
Subtraction of unsigned n-digit numbers: M – N
o Add M to the r's complement of N; this results in
M + (r^n – N) = M – N + r^n
o If M >= N, the sum will produce an end carry r^n, which is discarded
o If M < N, the sum does not produce an end carry and is equal to
r^n – (N – M), which is the r's complement of (N – M). To obtain the answer
in a familiar form, take the r's complement of the sum and place a negative sign
in front.
Example: 72532 – 13250 = 59282. The 10's complement of 13250 is 86750.
M                   =    72532
10's comp. of N     =  + 86750
Sum                 =   159282
Discard end carry   =  –100000
Answer              =    59282
Example with M < N: 13250 – 72532
M                   =    13250
10's comp. of N     =  + 27468
Sum                 =    40718
No end carry
Answer              =  –59282 (10's comp. of 40718)
Example in binary with X = 1010100 and Y = 1000011:
X                   =    1010100
2's comp. of Y      =  + 0111101
Sum                 =   10010001
Discard end carry   =  –10000000
Answer X – Y        =    0010001
Y                   =    1000011
2's comp. of X      =  + 0101100
Sum                 =    1101111
No end carry
Answer Y – X        =  –0010001 (2's comp. of 1101111)
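The procedure can be checked with a short routine. The sketch below is a hypothetical helper that works for any radix; it forms the r's complement of N as r^n – N, adds it to M, and then either discards the end carry or re-complements and negates, exactly as in the steps above.

```python
def subtract_unsigned(m, n, digits, radix=10):
    """Compute M - N for unsigned n-digit numbers using the r's complement of N."""
    modulus = radix ** digits                  # r^n
    if not (0 <= m < modulus and 0 <= n < modulus):
        raise ValueError("operands must fit in the given number of digits")
    total = m + (modulus - n)                  # M + (r's complement of N)
    if total >= modulus:                       # end carry produced: M >= N
        return total - modulus                 # discard the end carry
    return -(modulus - total)                  # no end carry: negate the r's complement


print(subtract_unsigned(72532, 13250, digits=5))                 # 59282
print(subtract_unsigned(13250, 72532, digits=5))                 # -59282
print(subtract_unsigned(0b1010100, 0b1000011, digits=7, radix=2))  # 17, i.e. 0010001
```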
Section 3.3 – Fixed-Point Representation
Positive integers and zero can be represented by unsigned numbers
Negative numbers must be represented by signed numbers; since + and – signs are not
available, only 1's and 0's are used
Signed numbers have the msb as 0 for positive and 1 for negative; the msb is the sign bit
Two ways to designate the binary point position in a register
o Fixed-point position
o Floating-point representation
Fixed-point position usually uses one of the two following positions
o A binary point at the extreme left of the register to make it a fraction
o A binary point at the extreme right of the register to make it an integer
o In both cases, a binary point is not actually present
The floating-point representation uses a second register to designate the position of
the binary point in the first register
Consider an 8-bit register and the number +14
o The only way to represent it is 00001110
Consider an 8-bit register and the number –14
o Signed magnitude: 10001110
o Signed 1's complement: 11110001
o Signed 2's complement: 11110010
Typically, signed 2's complement is used
Subtraction of two signed 2's complement numbers is as follows
o Take the 2's complement of the subtrahend (including the sign bit)
o Add it to the minuend (including the sign bit)
o A carry out of the sign bit position is discarded
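The three representations of –14 can be generated mechanically. The helper below is a hypothetical illustration for an arbitrary register width; it is not part of any library.

```python
def signed_representations(value, bits=8):
    """Return the signed-magnitude, 1's complement and 2's complement bit strings."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit beside the sign bit")
    mask = (1 << bits) - 1
    if value >= 0:
        sign_magnitude = ones = twos = magnitude      # all three forms coincide
    else:
        sign_magnitude = (1 << (bits - 1)) | magnitude  # set the sign bit only
        ones = ~magnitude & mask                        # complement every bit
        twos = (ones + 1) & mask                        # 1's complement plus 1
    return {name: format(word, f"0{bits}b")
            for name, word in (("signed magnitude", sign_magnitude),
                               ("signed 1's complement", ones),
                               ("signed 2's complement", twos))}


print(signed_representations(+14))
print(signed_representations(-14))
```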
Section 3.4 – Floating-Point Representation
The floating-point representation of a number has two parts
The first part represents a signed, fixed-point number: the mantissa
The second part designates the position of the binary point: the exponent
The mantissa may be a fraction or an integer
Example: the decimal number +6132.789 is
o Fraction: +0.6132789
o Exponent: +04
o Equivalent to +0.6132789 x 10^+4
A floating-point number is always interpreted to represent m x r^e
Example: the binary number +1001.11 (with 8-bit fraction and 6-bit exponent)
o Fraction: 01001110
o Exponent: 000100
o Equivalent to +(.1001110)_2 x 2^+4
A floating-point number is said to be normalized if the most significant digit of the
mantissa is nonzero
The decimal number 350 is normalized, 00350 is not
The 8-bit number 00011010 is not normalized
Normalize it by using fraction = 11010000 and exponent = -3
Normalized numbers provide the maximum possible precision for the floating-
point number
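Normalization amounts to shifting the fraction left until its leading bit is 1 and lowering the exponent once per shift. The sketch below follows the 8-bit example above and treats all eight bits as fraction bits, which is an assumption of this illustration; zero is returned unchanged because it has no nonzero digit to bring to the front.

```python
def normalize(fraction, exponent=0, bits=8):
    """Shift the fraction left until its most significant bit is 1."""
    if fraction == 0:
        return 0, 0                      # zero cannot be normalized
    mask = (1 << bits) - 1
    top_bit = 1 << (bits - 1)
    while not fraction & top_bit:
        fraction = (fraction << 1) & mask
        exponent -= 1                    # each left shift is compensated in the exponent
    return fraction, exponent


fraction, exponent = normalize(0b00011010)
print(format(fraction, "08b"), exponent)  # 11010000 -3
```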
Section 3.5 – Other Binary Codes
Digital systems can process data in discrete form only
Continuous, or analog, information is converted into digital form by means of an
analog-to-digital converter
The reflected binary, or Gray, code is sometimes used for the converted digital data
The Gray code changes by only one bit as it sequences from one number to the next
Gray code counters are sometimes used to provide the timing sequences that
control the operations in a digital system
Binary codes for decimal digits require a minimum of four bits
Other codes besides BCD exist to represent decimal digits
The 2421 code and the excess-3 code are both self-complementing
The 9's complement of each digit is obtained by complementing each bit in the code
The 2421 code is a weighted code
The bits are multiplied by the indicated weights and the sum gives the decimal digit
The excess-3 code is obtained by adding 3 to the corresponding BCD code
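Two of the properties above are easy to verify in code: consecutive Gray code words differ in exactly one bit, and complementing the bits of an excess-3 digit gives the code of its 9's complement. The conversion n XOR (n >> 1) is the standard way to obtain the reflected binary code; the function names are chosen for this sketch.

```python
def binary_to_gray(n):
    """Convert a binary number to its reflected (Gray) code."""
    return n ^ (n >> 1)


def excess3(digit):
    """Excess-3 code of a decimal digit: its BCD value plus 3."""
    if not 0 <= digit <= 9:
        raise ValueError("not a decimal digit")
    return digit + 3


# 4-bit Gray code sequence: each word differs from the next in one bit position.
for n in range(16):
    print(n, format(binary_to_gray(n), "04b"))

# Self-complementing property of excess-3.
for d in range(10):
    complemented = ~excess3(d) & 0xF
    print(d, format(excess3(d), "04b"), "->", format(complemented, "04b"), "= code of", 9 - d)
```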
Section 3.6 – Error Detection Codes
The P(odd) bit is chosen to make the sum of 1's in all four bits odd
The even-parity scheme has the disadvantage of having a bit combination of all 0's
Procedure during transmission:
o At the sending end, the message is applied to a parity generator
o The message, including the parity bit, is transmitted
o At the receiving end, all the incoming bits are applied to a parity checker
o Any odd number of errors is detected
Parity generators and checkers are constructed with XOR gates (odd function)
An odd function generates 1 iff an odd number of input variables are 1
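The generator and checker can be sketched with XOR reductions. The example assumes a 3-bit message plus one odd-parity bit (four transmitted bits in total); the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def odd_parity_bit(message_bits):
    """Parity generator: choose P so that the total number of 1's is odd."""
    return 1 ^ reduce(xor, message_bits, 0)


def parity_ok(received_bits):
    """Parity checker: True when the received word still has an odd number of 1's."""
    return reduce(xor, received_bits, 0) == 1


message = [1, 0, 1]
word = message + [odd_parity_bit(message)]
print(word, parity_ok(word))              # [1, 0, 1, 1] True

single_error = [word[0] ^ 1] + word[1:]
print(parity_ok(single_error))            # False: one flipped bit is detected

double_error = [word[0] ^ 1, word[1] ^ 1] + word[2:]
print(parity_ok(double_error))            # True: an even number of errors goes unnoticed
```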
COMPUTER ARITHMETIC
Introduction:
A processor has an arithmetic processor (as a subpart of it) that
executes arithmetic operations. The data are assumed to
reside in processor registers during the execution of an
arithmetic instruction. Negative numbers may be in a signed-
magnitude or signed-complement representation. There are
three ways of representing negative fixed-point binary
numbers: signed magnitude, signed 1's complement, or
signed 2's complement. Most computers use the signed-
magnitude representation for the mantissa.
Addition and Subtraction:
Addition and Subtraction with Signed-Magnitude Data
Addition: A + B; A: Augend; B: Addend
Subtraction: A - B; A: Minuend; B: Subtrahend
                Add            Subtract Magnitudes
Operation       Magnitudes     When A>B     When A<B     When A=B
(+A) + (+B)     +(A+B)
(+A) + (-B)                    +(A-B)       -(B-A)       +(A-B)
(-A) + (+B)                    -(A-B)       +(B-A)       +(A-B)
(-A) + (-B)     -(A+B)
(+A) - (+B)                    +(A-B)       -(B-A)       +(A-B)
(+A) - (-B)     +(A+B)
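[Figure: hardware for signed-magnitude addition and subtraction, showing the complementer with mode control M, the parallel adder with input carry, output carry E and sum output S, the add-overflow flip-flop AVF, and the A register with sign As and load-sum control.]
Computer Organization, Prof. H. Yoon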
Signed 2's Complement Addition and Subtraction
[Figure: hardware for signed 2's complement addition and subtraction, showing the B register and AC.]
Algorithm
Subtract: AC ← AC + B' + 1, V ← overflow, END
Add: AC ← AC + B, V ← overflow, END
Algorithm:
The two signs As and Bs are compared by an exclusive-OR gate. If the output of the gate is 0, the signs are
identical; if it is 1, the signs are
different.
The two magnitudes are subtracted if the signs are different for an add
operation or identical for a subtract operation. The magnitudes are subtracted
by adding A to the 2's complement of B. No overflow can occur if the numbers
are subtracted, so AVF is cleared to 0.
A 1 in E indicates that A >= B and the number in A is the correct result. If this
number is zero, the sign As must be made positive to avoid a negative zero.
A 0 in E indicates that A < B; in this case the 2's complement of the value in A
must be taken to obtain the correct magnitude.
In other paths of the flowchart, the sign of the result is the same as the sign of
A, so no change in As is required. However, when A < B, the sign of the result is
the complement of the original sign of A. It is then necessary to complement As
to obtain the correct sign.
The final result is found in register A and its sign in As. The value in AVF provides
an overflow indication. The final value of E is immaterial.
The add-overflow flip-flop AVF holds the overflow bit when A and B are added.
The A register provides other microoperations that may be needed when we
specify the sequence of steps in the algorithm.
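The flow described above can be traced in software. The sketch below is an illustrative model, not the hardware itself: signs are passed as 0 (plus) or 1 (minus), magnitudes are unsigned integers of a chosen width, and the function name is hypothetical.

```python
def signed_magnitude_addsub(a_sign, a_mag, b_sign, b_mag, subtract=False, bits=8):
    """Return (As, A, AVF) after A <- A + B or A <- A - B on signed-magnitude data."""
    mask = (1 << bits) - 1
    signs_differ = a_sign ^ b_sign                 # output of the XOR gate on As, Bs
    if signs_differ == int(subtract):
        # Same signs on add, or different signs on subtract: add the magnitudes.
        total = a_mag + b_mag
        return a_sign, total & mask, total >> bits   # AVF <- E (carry out)
    # Otherwise subtract the magnitudes: A + (2's complement of B).
    total = a_mag + ((~b_mag) & mask) + 1
    e, a_mag = total >> bits, total & mask
    if e == 1:                                     # A >= B: A already holds the result
        if a_mag == 0:
            a_sign = 0                             # avoid a negative zero
    else:                                          # A < B: fix magnitude and sign
        a_mag = (((~a_mag) & mask) + 1) & mask     # A <- A' + 1
        a_sign ^= 1                                # As <- As'
    return a_sign, a_mag, 0                        # AVF cleared: no overflow possible


print(signed_magnitude_addsub(0, 5, 1, 3))                 # (+5) + (-3) -> (0, 2, 0)
print(signed_magnitude_addsub(0, 3, 1, 5))                 # (+3) + (-5) -> (1, 2, 0)
print(signed_magnitude_addsub(1, 5, 1, 5, subtract=True))  # (-5) - (-5) -> (0, 0, 0)
print(signed_magnitude_addsub(0, 200, 0, 100))             # overflow    -> (0, 44, 1)
```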
Multiplication Algorithm:
The low-order bit of the multiplier in Qn is tested. If it is 1, the multiplicand (B) is added
to the present partial product (A); if it is 0, nothing is added. Register EAQ is then shifted once to the right
to form the new partial product. The sequence counter is decremented by 1 and its new
value checked. If it is not equal to zero, the process is repeated and a new partial product
is formed. The process stops when SC = 0.
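The loop above can be sketched with the register names used in the text (B, A, Q, E and SC). The operand values in the example, 10111 x 10011 on 5-bit magnitudes, are chosen for illustration; the function itself is hypothetical and handles magnitudes only, leaving the sign to be worked out separately.

```python
def shift_add_multiply(multiplicand, multiplier, bits):
    """Multiply two unsigned magnitudes with the add-and-shift-right scheme."""
    mask = (1 << bits) - 1
    b, q = multiplicand & mask, multiplier & mask
    a, e, sc = 0, 0, bits                     # partial product, carry, sequence counter
    while sc != 0:
        if q & 1:                             # Qn = 1: add the multiplicand to A
            total = a + b
            e, a = total >> bits, total & mask
        # Shift EAQ right: E enters A, the low bit of A enters Q, E is cleared.
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (e << (bits - 1))
        e = 0
        sc -= 1
    return (a << bits) | q                    # the double-length product sits in AQ


product = shift_add_multiply(0b10111, 0b10011, bits=5)
print(product, format(product, "010b"))       # 437 0110110101
```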
Booth's algorithm:
The Booth algorithm gives a procedure for multiplying binary integers in
signed-2's complement representation.
It operates on the fact that strings of 0's in the multiplier require no addition but just
shifting, and a string of 1's in the multiplier from bit weight 2^k to weight 2^m can
be treated as 2^{k+1} - 2^m.
For example, the binary number 001110 (+14) has a string of 1's from 2^3 to 2^1
(k = 3, m = 1). The number can be represented as 2^{k+1} - 2^m = 2^4 - 2^1 = 16 - 2 =
14. Therefore, the multiplication M x 14, where M is the multiplicand and 14
the multiplier, can be done as M x 2^4 - M x 2^1.
Thus the product can be obtained by shifting the binary multiplicand M four times
to the left and subtracting M shifted left once.
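As in all multiplication schemes, the Booth algorithm requires examination of
the multiplier bits and shifting of the partial product. Prior to the shifting, the
multiplicand may be added to the partial product, subtracted from it, or left
unchanged according to the following rules:
1. The multiplicand is subtracted from the partial product upon encountering the
first least significant 1 in a string of 1's in the multiplier.
2. The multiplicand is added to the partial product upon encountering the first 0 in a
string of 0's in the multiplier (provided that there was a previous 1).
3. The partial product does not change when the multiplier bit is identical to the
previous multiplier bit.
The algorithm works for positive or negative multipliers in 2's
complement representation.
The two bits inspected are the low-order multiplier bit Qn and the extra bit Qn+1
appended to its right.
If the two bits are equal to 10, it means that the first 1 in a string of 1's has been
encountered. This requires a subtraction of the multiplicand from the partial
product in AC.
If the two bits are equal to 01, it means that the first 0 in a string of 0's has
been encountered. This requires the addition of the multiplicand to the partial
product in AC.
When the two bits are equal, the partial product does not change.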
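The rules can be exercised with a short model. The sketch below uses hypothetical names (ac for the partial product, qr for the multiplier register, q_extra for the bit appended to the right of the multiplier) and assumes both operands fit in the chosen width, excluding the most negative value for the multiplicand; the sample operands are picked for illustration.

```python
def booth_multiply(multiplicand, multiplier, bits):
    """Multiply two signed 2's complement integers of the given width with Booth's rules."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    br, qr = multiplicand & mask, multiplier & mask
    ac, q_extra = 0, 0
    for _ in range(bits):
        pair = (qr & 1, q_extra)
        if pair == (1, 0):                    # first 1 of a string of 1's: subtract
            ac = (ac - br) & mask
        elif pair == (0, 1):                  # first 0 after a string of 1's: add
            ac = (ac + br) & mask
        # Arithmetic shift right of AC, QR and the extra bit as one register.
        q_extra = qr & 1
        qr = (qr >> 1) | ((ac & 1) << (bits - 1))
        ac = (ac >> 1) | (ac & sign_bit)      # the sign of AC is preserved
    product = (ac << bits) | qr
    if product & (1 << (2 * bits - 1)):       # interpret the 2n-bit result as signed
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(-9, -13, bits=5))   # 117
print(booth_multiply(3, -4, bits=4))     # -12
```

Strings of equal multiplier bits cost only shifts, which is where the scheme saves additions compared with testing every bit.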
Division Algorithms
The divisor B is compared with the five most significant bits of the dividend. Since
the 5-bit number is smaller than B, the comparison is repeated with the six most
significant bits. Now the 6-bit number is greater than B, so we place a 1 for the quotient
bit in the sixth position above the dividend. The divisor is then shifted once to the right
and subtracted from the dividend. The difference is known as a partial remainder
because the division could have stopped here to obtain a quotient of 1 and
a remainder equal to the partial remainder. Comparing a partial remainder with the
divisor continues the process. If the partial remainder is greater than or equal to the
divisor, the quotient bit is equal to 1. The divisor is then shifted right and subtracted
from the partial remainder. If the partial remainder is smaller than the divisor, the
quotient bit is 0 and no subtraction is needed. The divisor is shifted once to the right
in any case. The result gives both a quotient and a remainder.
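The compare, subtract, and shift procedure described above can be sketched as follows. The
function name binary_divide is an assumption, and the operands in the usage line (dividend
448, divisor 17) are illustrative values chosen here, not taken from the notes. Instead of
shifting the divisor right, the sketch brings down one dividend bit at a time, which is the
equivalent view of the same paper-and-pencil method.

```python
def binary_divide(dividend: int, divisor: int) -> tuple[int, int]:
    """Shift-and-subtract division of unsigned integers, mirroring long division."""
    if divisor == 0:
        raise ZeroDivisionError("divide by 0")
    quotient = 0
    remainder = 0
    for i in range(dividend.bit_length() - 1, -1, -1):
        # bring down the next dividend bit into the partial remainder
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:   # partial remainder >= divisor: quotient bit is 1
            remainder -= divisor
            quotient |= 1
        # otherwise the quotient bit is 0 and no subtraction is needed
    return quotient, remainder


print(binary_divide(0b0111000000, 0b10001))   # (26, 6)
```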
Hardware Implementation for Signed-Magnitude Data
Algorithm:
Example of Binary Division with Digital Hardware
Floating-point Arithmetic Operations:
Many high-level programming languages have a facility for specifying floating-point
numbers. The most common way is by a real declaration statement. High-level
programming languages must have a provision for handling floating-point arithmetic
operations. The operations are generally built into the internal hardware. If no hardware is
available, the compiler must be designed with a package of floating-point software
subroutines. Although the hardware method is more expensive, it is much more efficient
than the software method. Therefore, floating-point hardware is included in most
computers and is omitted only in very small ones.
Basic Considerations:
A floating-point number has two parts, a mantissa m and an exponent e, and represents
the number
m × r^e
The mantissa may be a fraction or an integer. The position of the radix point and the value of
the radix r are not included in the registers. For example, assume a fraction representation
and a radix 10. The decimal number 537.25 is represented in a register with m = 53725 and
e = 3 and is interpreted to represent the floating-point number
.53725 × 10^3
A floating-point number is said to be normalized if the most significant digit of the mantissa
is nonzero. In this way the mantissa contains the maximum possible number of significant
digits. A zero cannot be normalized because it does not have a nonzero digit. It is
represented in floating-point by all 0's in the mantissa and exponent.
Floating-point representation increases the range of numbers for a given register. Consider a
computer with 48-bit words. Since one bit must be reserved for the sign, the range of
fixed-point integer numbers will be ±(2^{47} - 1), which is approximately ±10^{14}. The 48
bits can be used to represent a floating-point number with 36 bits for the mantissa and 12
bits for the exponent. Assuming fraction representation for the mantissa and taking the
two sign bits into consideration, the range of numbers that can be represented is
±(1 - 2^{-35}) × 2^{2047}
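The two ranges can be checked with exact integer arithmetic. The sketch below is only a
worked calculation; the variable names are assumptions. It relies on the reading that one
of the 36 mantissa bits and one of the 12 exponent bits are sign bits, which leaves a
35-bit fraction and an 11-bit exponent magnitude, so the largest exponent is 2^{11} - 1 = 2047.

```python
from fractions import Fraction

WORD_BITS = 48
fixed_max = 2 ** (WORD_BITS - 1) - 1            # one bit reserved for the sign

MANTISSA_BITS, EXPONENT_BITS = 36, 12           # each includes its own sign bit
largest_fraction = 1 - Fraction(1, 2 ** (MANTISSA_BITS - 1))   # 1 - 2^-35
largest_exponent = 2 ** (EXPONENT_BITS - 1) - 1                # 2^11 - 1
float_max = largest_fraction * 2 ** largest_exponent

print(fixed_max)                 # 140737488355327, about 1.4 x 10^14
print(largest_exponent)          # 2047
print(float_max > fixed_max)     # True
```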
Computers with shorter word lengths use two or more words to represent a floating-point
number. An 8-bit microcomputer uses four words to represent one floating-point number.
One word of 8 bits is reserved for the exponent and the 24 bits of the other three words
are used for the mantissa.
Arithmetic operations with floating-point numbers are more complicated than with fixed-point
numbers. Their execution also takes longer and requires more complex
hardware. Adding or subtracting two numbers requires first an alignment of the radix
point, since the exponent parts must be made equal before adding or subtracting the
mantissas. The alignment is done by shifting one mantissa while its exponent is adjusted
until it is equal to the other exponent. Consider the sum of the following floating-point
numbers:
.5372400 × 10^2
+ .1580000 × 10^{-1}
Here the second mantissa is shifted three positions to the right, giving .0001580 × 10^2,
before the mantissas are added.
The operations done with the mantissas are the same as in fixed-point numbers, so the two
can share the same registers and circuits. The exponents
are compared and incremented (for aligning the mantissas), added and subtracted (for
multiplication and division), and decremented (to normalize the result). The exponent can be
represented in any one of three representations: signed-magnitude, signed 2's
complement, or signed 1's complement.
Biased exponents have the advantage that they contain only positive numbers. It is therefore
simpler to compare their relative magnitude without regard to their signs.
Another advantage is that the smallest possible biased exponent contains all zeros. The
floating-point representation of zero is then a zero mantissa and the smallest possible
exponent.
Register Configuration
The register configuration for floating-point operations is shown in Figure 4.13. As a rule, the
same registers and adder used for fixed-point arithmetic are used for processing the
mantissas. The difference lies in the way the exponents are handled.
There are three registers: BR, AC, and QR. Each register is subdivided into two parts. The
mantissa part has the same uppercase letter symbols as in fixed-point representation. The
exponent part uses the corresponding lowercase letter symbol.
Floating-point number: F = m × r^e, where m is the mantissa, r the radix, and e the exponent.
Registers for floating-point arithmetic:
BR: Bs | B | b
AC: As | A1 | A | a
QR: Qs | Q | q
Computer Organization, Prof. H. Yoon
Figure 4.13: Registers for floating-point arithmetic operations
Similarly, register BR is subdivided into Bs, B, and b, and QR into Qs, Q, and q. A
parallel adder adds the two mantissas and loads the sum into A and the carry into E. A
separate parallel adder can be used for the exponents. The exponents do not have a distinct
sign bit because they are represented as a biased positive quantity. It is assumed
that the range of floating-point numbers is so large that the chance of an exponent overflow
is very remote, so exponent overflow will be neglected. The exponents are also
connected to a magnitude comparator that provides three binary outputs to indicate their
relative magnitude.
The number in the mantissa is taken as a fraction, so the binary point is assumed to
reside to the left of the magnitude part. Integer representation for floating point causes
certain scaling problems during multiplication and division. To avoid these problems, we
adopt a fraction representation.
The numbers in the registers should initially be normalized. After each arithmetic operation,
the result is normalized. Thus all floating-point operands are always normalized.
Addition and Subtraction of Floating-Point Numbers
During addition or subtraction, the two floating-point operands are kept in AC and BR. The
sum or difference is formed in the AC. The algorithm can be divided into four
consecutive parts:
1. Check for zeros.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
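The four parts listed above can be sketched in Python. This is a simplified model under
stated assumptions, not the register-level algorithm: it uses radix 10 and a 7-digit
fractional mantissa, it stores the mantissa as a signed integer (so 5372400 with exponent 2
means .5372400 × 10^2), and digits shifted out during alignment are simply dropped. The names
fp_add and DIGITS are hypothetical.

```python
DIGITS = 7  # mantissa width in decimal digits (an assumption for this sketch)


def fp_add(m1: int, e1: int, m2: int, e2: int) -> tuple[int, int]:
    """Add two numbers of the form 0.m x 10^e; mantissas are signed integers."""
    # 1. Check for zeros: the result is simply the other operand.
    if m1 == 0:
        return m2, e2
    if m2 == 0:
        return m1, e1
    # 2. Align the mantissas: shift the one with the smaller exponent right.
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    shift = e1 - e2
    sign2 = -1 if m2 < 0 else 1
    m2 = sign2 * (abs(m2) // 10 ** shift)   # digits shifted out are lost
    # 3. Add the mantissas (subtraction is addition of a negative mantissa).
    total, exponent = m1 + m2, e1
    if total == 0:
        return 0, 0
    # 4. Normalize the result.
    sign = -1 if total < 0 else 1
    magnitude = abs(total)
    limit = 10 ** DIGITS
    if magnitude >= limit:                  # mantissa overflow: shift right
        magnitude //= 10
        exponent += 1
    while magnitude < limit // 10:          # leading zero: shift left
        magnitude *= 10
        exponent -= 1
    return sign * magnitude, exponent


print(fp_add(5372400, 2, 1580000, -1))   # (5373980, 2), i.e. .5373980 x 10^2
```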
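[Flowchart: Floating-point division. The divisor is in BR and the dividend in AC. If BR = 0,
a divide-by-zero condition is signalled; if AC = 0, QR ← 0. Otherwise Qs ← As ⊕ Bs, Q ← 0,
SC ← n - 1, and EA ← A + B' + 1. A is then restored with A ← A + B; if A >= B (E = 1), the
dividend is also aligned with shr A and a ← a + 1. The exponents are then processed with
a ← a + b' + 1, a ← a + bias, and q ← a.]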
KNREDDY
UNIT-IV
MEMORY AND INPUT/OUTPUT ORGANIZATION
Memory Organization: Memory Hierarchy, Main Memory, Auxiliary Memory, Associative Memory,
Cache Memory, Virtual Memory.
Input/Output Organization: Input-Output Interface, Asynchronous Data Transfer, Modes of
Transfer, Priority Interrupt, Direct Memory Access (DMA).
COMPUTER ORGANIZATION AND ARCHITECTURE
MEMORY HIERARCHY
The memory unit is an essential component in any digital computer since it is needed for
storing programs and data. A very small computer with a limited application may be able to
fulfill its intended task without the need of additional storage capacity.
Most general-purpose computers would run more efficiently if they were equipped with
additional storage beyond the capacity of the main memory.
It is more economical to use low-cost storage devices to serve as a backup for storing the
information that is not currently used by the CPU.
The memory unit that communicates directly with the CPU is called the main memory.
Devices that provide backup storage are called auxiliary memory. The most common
auxiliary memory devices used in computer systems are magnetic disks and tapes.
They are used for storing system programs, large data files, and other backup information.
Only programs and data currently needed by the processor reside in main memory. All other
information is stored in auxiliary memory and transferred to main memory when needed.
The memory hierarchy system consists of all storage devices employed in a computer system,
from the slow but high-capacity auxiliary memory to a relatively faster main memory, to an
even smaller and faster cache memory accessible to the high-speed processing logic.
Figure: Memory hierarchy in a computer system
The main memory occupies a central position by being able to communicate directly with the
CPU and with auxiliary memory devices through an I/O processor.
When programs not residing in main memory are needed by the CPU, they are brought in from
auxiliary memory. Programs not currently needed in main memory are transferred into auxiliary
memory to provide space for currently used programs and data.
A special very-high-speed memory called a cache is sometimes used to increase the speed of
processing by making current programs and data available to the CPU at a rapid rate. The cache
memory is employed in computer systems to compensate for the speed differential between
main memory access time and processor logic.
MAIN MEMORY
The main memory is the central storage unit in a computer system. It is a relatively large and
fast memory used to store programs and data during the computer operation.
The principal technology used for the main memory is based on semiconductor integrated
circuits.
Integrated circuit RAM chips are available in two possible operating modes, static and
dynamic. The static RAM consists essentially of internal flip-flops that store the binary
information. The stored information remains valid as long as power is applied to the unit. The
dynamic RAM stores the binary information in the form of electric charges that are applied to
capacitors. The capacitors are provided inside the chip by MOS transistors. The stored
charge on the capacitors tends to discharge with time, and the capacitors must be
periodically recharged by refreshing the dynamic memory.
The dynamic RAM offers reduced power consumption and larger storage capacity in a single
memory chip.
The static RAM is easier to use and has shorter read and write cycles.
Most of the main memory in a general-purpose computer is made up of RAM integrated circuit
chips, but a portion of the memory may be constructed with ROM chips.
RAM refers to a random-access memory, but the term is used to designate a read/write memory
to distinguish it from a read-only memory, although ROM is also random access.
RAM is used for storing the bulk of the programs and data that are subject to change. ROM is
used for storing programs that are permanently resident in the computer.
The ROM portion of main memory is needed for storing an initial program called a bootstrap
loader. The bootstrap loader is a program whose function is to start the computer software
operating when power is turned on.
Since RAM is volatile, its contents are destroyed when power is turned off. The contents of
ROM remain unchanged after power is turned off and on again.
The startup of a computer consists of turning the power on and starting the execution of an
initial program. Thus when power is turned on, the hardware of the computer sets the program
counter to the first address of the bootstrap loader. The bootstrap program loads a portion of
the operating system from disk to main memory, and control is then transferred to the operating
system, which prepares the computer for general use.
RAM and ROM chips are available in a variety of sizes. If the memory needed for the computer is
larger than the capacity of one chip, it is necessary to combine a number of chips to form the
required memory size. Ex: A 1024 × 8 memory can be constructed with 128 × 8 RAM chips and
512 × 8 ROM chips.
RAM AND ROM CHIPS
A RAM chip is better suited for communication with the CPU if it has one or more control
inputs that select the chip only when needed. Another common feature is a bidirectional
data bus that allows the transfer of data either from memory to CPU during a read operation or
from CPU to memory during a write operation.
A bidirectional bus can be constructed with three-state buffers.
The block diagram of a RAM chip is shown in Fig.
The capacity of the memory is 128 words of eight bits (one byte) per word. This requires a 7-bit
address and an 8-bit bidirectional data bus. The read and write inputs specify the memory
operation, and the two chip select (CS) control inputs are for enabling the chip only when it is
selected by the microprocessor. The availability of more than one control input to select
the chip facilitates the decoding of the address lines when multiple chips are used in the
microcomputer.
The read and write inputs are sometimes combined into one line labeled R/W. When the chip is
selected, the two binary states in this line specify the two operations of read or write.
The unit is in operation only when CS1 = 1 and CS2 = 0.
If the chip select inputs are not enabled, or if they are enabled but the read or write
inputs are not enabled, the memory is inhibited and its data bus is in a high-impedance state.
When CS1 = 1 and CS2 = 0, the memory can be placed in a write or read mode. When the WR
input is enabled, the memory stores a byte from the data bus into a location specified by the
address input lines.
When the RD input is enabled, the content of the selected byte is placed onto the data bus. The
RD and WR signals control the memory operation as well as the bus buffers associated with the
bidirectional data bus.
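The behavior of the 128 × 8 RAM chip described above can be modeled in a few lines. This is a
behavioral sketch only: the class name RamChip and the method access are hypothetical, None
stands for the high-impedance state, and giving RD priority when both RD and WR are enabled
is an assumption made here.

```python
class RamChip:
    """Behavioral model of a 128 x 8 RAM chip with two chip-select inputs."""

    def __init__(self) -> None:
        self.cells = [0] * 128                      # 128 words of 8 bits

    def access(self, cs1: int, cs2: int, rd: int, wr: int,
               address: int, data_in: int = 0):
        """Return the byte driven onto the data bus, or None for high impedance."""
        if not (cs1 == 1 and cs2 == 0):
            return None                             # chip not selected: inhibited
        address &= 0x7F                             # 7-bit address
        if rd:
            return self.cells[address]              # read: drive the data bus
        if wr:
            self.cells[address] = data_in & 0xFF    # write: store byte from the bus
        return None                                 # write or inhibit: bus not driven


chip = RamChip()
chip.access(cs1=1, cs2=0, rd=0, wr=1, address=5, data_in=0xAB)
print(chip.access(cs1=1, cs2=0, rd=1, wr=0, address=5))   # 171
print(chip.access(cs1=0, cs2=0, rd=1, wr=0, address=5))   # None
```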
A ROM chip is organized externally in a similar manner. Since a ROM can only be read, its data
bus can only be in an output mode. The block diagram of a ROM chip is shown in Fig.
The nine address lines in the ROM chip specify any one of the 512 bytes stored in it. The two
chip select inputs must be CS1 = 1 and CS2 = 0 for the unit to operate. Otherwise, the data bus
is in a high-impedance state. There is no need for a read or write control because the unit can
only read. Thus when the chip is enabled by the two select inputs, the byte selected by the
address lines appears on the data bus.
MEMORY ADDRESS MAP
The designer of a computer system must calculate the amount of memory required for the
particular application and assign it to either RAM or ROM. The interconnection between
memory and processor is then established from knowledge of the size of memory needed and
the type of RAM and ROM chips available.
A memory address map is a pictorial representation of the assigned address space for each chip
in the system.
To demonstrate with a particular example, assume that a computer system needs 512 bytes of
RAM and 512 bytes of ROM.
The memory address map for this configuration is shown in Table.
The small x's under the address bus lines designate those lines that must be connected to the
address inputs in each chip.
The RAM chips have 128 bytes and need seven address lines. The ROM chip has 512 bytes and
needs nine address lines.
It is now necessary to distinguish between the four RAM chips by assigning to each a different
address. For this particular example we choose bus lines 8 and 9 to represent four
distinct binary combinations.
The distinction between a RAM and ROM address is done with another bus line. Here we
choose line 10 for this purpose. When line 10 is 0, the CPU selects a RAM, and when this
line is equal to 1, it selects the ROM.
The first hexadecimal digit represents lines 13 to 16 and is always 0. The next
hexadecimal digit represents lines 9 to 12, but lines 11 and 12 are always 0. The range of
hexadecimal addresses for each component is determined from the x's associated with it. These
x's represent a binary number that can range from an all-0's to an all-1's value.
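The hexadecimal ranges follow directly from the line assignments just described. The sketch
below recomputes them; the function name address_map is hypothetical, and bus line k is taken
to be bit k - 1 of the address (line 1 is the least significant bit).

```python
def address_map():
    """Return (component, first address, last address) for the example system."""
    rows = []
    for chip in range(4):                   # four 128-byte RAM chips
        base = chip << 7                    # lines 9 and 8 hold the chip number
        rows.append((f"RAM {chip + 1}", base, base + 127))   # lines 1-7 are x's
    rom_base = 1 << 9                       # line 10 = 1 selects the ROM
    rows.append(("ROM", rom_base, rom_base + 511))           # lines 1-9 are x's
    return rows


for name, low, high in address_map():
    print(f"{name:6s} {low:04X} - {high:04X}")
```

For these assignments the sketch gives 0000 - 007F, 0080 - 00FF, 0100 - 017F, and 0180 - 01FF
for the four RAM chips and 0200 - 03FF for the ROM.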
MEMORY CONNECTION TO CPU
RAM and ROM chips are connected to a CPU through the data and address buses.
The low-order lines in the address bus select the byte within the chips, and other lines in the
address bus select a particular chip through its chip select inputs.
The connection of memory chips to the CPU is shown in Fig. This configuration gives a
memory capacity of 512 bytes of RAM and 512 bytes of ROM.
Each RAM receives the seven low-order bits of the address bus to select one of 128 possible
bytes. The particular RAM chip selected is determined from lines 8 and 9 in the address bus.
This is done through a 2 × 4 decoder whose outputs go to the CS1 input in each RAM chip.
Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected. When 01, the
second RAM chip is selected, and so on.
The RD and WR outputs from the microprocessor are applied to the inputs of each RAM chip.
The selection between RAM and ROM is achieved through bus line 10. The RAMs are selected
when the bit in this line is 0, and the ROM when the bit is 1. The other chip select input in the
ROM is connected to the RD control line so that the ROM chip is enabled only during a read
operation.
Address bus lines 1 to 9 are applied to the input address of ROM without going through the
decoder. This assigns addresses 0 to 511 to RAM and 512 to 1023 to ROM.
The data bus of the ROM has only an output capability, whereas the data bus connected to the
RAMs can transfer information in both directions.
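The chip selection performed by bus line 10 and the 2 × 4 decoder can be sketched as a small
decoding function. The name select_chip is hypothetical, and bus line k is again taken to be
bit k - 1 of the address.

```python
def select_chip(address: int) -> tuple[str, int]:
    """Return (selected chip, byte offset inside that chip) for a 10-bit address."""
    address &= 0x3FF                        # only lines 1 to 10 are used
    if (address >> 9) & 1:                  # line 10 = 1: the ROM is selected
        return "ROM", address & 0x1FF       # lines 1-9 go straight to the ROM
    chip = (address >> 7) & 0b11            # lines 8 and 9 feed the 2 x 4 decoder
    return f"RAM {chip + 1}", address & 0x7F   # lines 1-7 select the byte


print(select_chip(0x0085))   # ('RAM 2', 5)
print(select_chip(0x0200))   # ('ROM', 0)
```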
AUXILIARY MEMORY:
The most common auxiliary memory devices used in computer systems are magnetic disks and
magnetic tapes. Other components used, but not as frequently, are magnetic drums, magnetic
bubble memory, and optical disks.
The important characteristics of any device are its access mode, access time, transfer rate,
capacity, and cost.
The average time required to reach a storage location in memory and obtain its contents
is called the access time. The access time consists of a seek time required to position the
read-write head to a location and a transfer time required to transfer data to or from the
device.
Auxiliary storage is organized in records or blocks. A record is a specified number of
characters or words. Reading or writing is always done on entire records. The transfer rate is
the number of characters or words that the device can transfer per second, after it has been
positioned at the beginning of the record.
Magnetic drums and disks are quite similar in operation. Both consist of high-speed rotating
surfaces coated with a magnetic recording medium. The rotating surface of the drum is a
cylinder and that of the disk, a round flat plate. Bits are recorded as magnetic spots on the
surface as it passes a stationary mechanism called a write head. Stored bits are detected by a
change in magnetic field produced by a recorded spot on the surface as it passes through a
read head.
MAGNETIC DISKS
A magnetic disk is a circular plate constructed of metal or plastic coated with magnetized
material. Often both sides of the disk are used, and several disks may be stacked on one spindle
with read/write heads available on each surface.
All disks rotate together at high speed and are not stopped or started for access purposes.
Bits are stored in the magnetized surface in spots along concentric circles called tracks. The
tracks are commonly divided into sections called sectors. In most systems, the minimum
quantity of information which can be transferred is a sector.
Some units use a single read/write head for each disk surface. The track address bits are used
by a mechanical assembly to move the head into the specified track position before reading or
writing.
ASSOCIATIVE MEMORY
Many data-processing applications require the search of items in a table stored in memory. An
assembler program searches the symbol address table in order to extract the symbol's binary
equivalent.
The number of accesses to memory depends on the location of the item and the efficiency of the
search algorithm. Many search algorithms have been developed to minimize the number of
accesses while searching for an item in a random or sequential access memory.
The time required to find an item stored in memory can be reduced considerably if stored data
can be identified for access by the content of the data itself rather than by an address.
A memory unit accessed by content is called an associative memory or content addressable
memory (CAM).
When a word is to be read from an associative memory, the content of the word, or part of the
word, is specified. The memory locates all words which match the specified content and marks
them for reading.
An associative memory is more expensive than a random access memory because each
cell must have storage capability as well as logic circuits for matching its content with an
external argument. For this reason, associative memories are used in applications where the
search time is very critical and must be very short.
HARDWARE ORGANIZATION
The block diagram of an associative memory is shown in Fig.
It consists of a memory array and logic for m words with n bits per word. The argument
register A and key register K each have n bits, one for each bit of a word. The match register
M has m bits, one for each memory word.
Each word in memory is compared in parallel with the content of the argument register. The
words that match the bits of the argument register set a corresponding bit in the match
register.
After the matching process, those bits in the match register that have been set indicate the fact
that their corresponding words have been matched.
Reading is accomplished by a sequential access to memory for those words whose
corresponding bits in the match register have been set.
The key register provides a mask for choosing a particular field or key in the argument word.
The entire argument is compared with each memory word if the key register contains all 1's.
Otherwise, only those bits in the argument that have 1's in their corresponding position of the
key register are compared.
To illustrate with a numerical example, suppose that the argument register A and the key
register K have the bit configuration shown below. Only the three leftmost bits of A are
compared with memory words because K has 1's in these positions.
A        101 111100
K        111 000000
Word 1   100 111100   no match
Word 2   101 000001   match
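The masked comparison in this example can be reproduced with bitwise operations. In the
sketch below the function name matches is hypothetical; a word matches when it agrees with
the argument in every bit position where the key holds a 1.

```python
def matches(word: int, argument: int, key: int) -> bool:
    """True if word equals argument in every bit position where key has a 1."""
    return ((word ^ argument) & key) == 0   # XOR marks differing bits, key masks them


A = 0b101111100
K = 0b111000000
print(matches(0b100111100, A, K))   # False: Word 1 differs in the third bit from the left
print(matches(0b101000001, A, K))   # True: Word 2 agrees in the three leftmost bits
```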
The relation between the memory array and external registers in an associative memory is
shown in Fig.
The cells in the array are marked by the letter C with two subscripts. The first subscript
gives the word number and the second specifies the bit position in the word. Thus cell C_ij is
the cell for bit j in word i.
A bit A_j in the argument register is compared with all the bits in column j of the array
provided that K_j = 1. This is done for all columns j = 1, 2, …, n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit M_i in the match register is set to 1. If one or more unmasked bits of the
argument and the word do not match, M_i is cleared to 0.
The internal organization of a typical cell C_ij is shown in Fig.
It consists of a flip-flop storage element F_ij and the circuits for reading, writing, and
matching the cell. The input bit is transferred into the storage cell during a write operation.
The bit stored is read out during a read operation.
The match logic compares the content of the storage cell with the corresponding unmasked bit
of the argument and provides an output for the decision logic that sets the bit in M_i.
MATCH LOGIC
The match logic for each word can be derived from the comparison algorithm for two binary
numbers. First, we neglect the key bits and compare the argument in A with the bits stored
in the cells of the words. Word i is equal to the argument in A if A_j = F_ij for j = 1, 2, …, n.
Two bits are equal if they are both 1 or both 0. The equality of two bits can be expressed
logically by the Boolean function
x_j = A_j F_ij + A'_j F'_ij
where x_j = 1 if the pair of bits in position j are equal; otherwise, x_j = 0.
For a word i to be equal to the argument in A we must have all x_j variables equal to 1.
This is the condition for setting the corresponding match bit M_i to 1. The Boolean function
for this condition is
M_i = x_1 x_2 x_3 … x_n
We now include the key bit K_j in the comparison logic. The requirement is that if K_j = 0, the
corresponding bits of A_j and F_ij need no comparison. Only when K_j = 1 must they
be compared. This requirement is achieved by ORing each term with K'_j, thus:
x_j + K'_j = x_j   if K_j = 1
x_j + K'_j = 1     if K_j = 0
When K_j = 1, we have K'_j = 0 and x_j + 0 = x_j. When K_j = 0, then K'_j = 1 and x_j + 1 = 1.
A term (x_j + K'_j) will be in the 1 state if its pair of bits is not compared. This is
necessary because each term is ANDed with all other terms, so that an output of 1 will have no
effect. The comparison of the bits has an effect only when K_j = 1.
The match logic for word i in an associative memory can now be expressed by the following
Boolean function:
M_i = (x_1 + K'_1)(x_2 + K'_2)(x_3 + K'_3) … (x_n + K'_n)
Each term in the expression will be equal to 1 if its corresponding K_j = 0. If K_j = 1, the
term will be either 0 or 1 depending on the value of x_j. A match will occur and M_i will be
equal to 1 if all terms are equal to 1.
If we substitute the original definition of x_j, the Boolean function above can be expressed
as follows:
M_i = ∏_{j=1}^{n} (A_j F_ij + A'_j F'_ij + K'_j)
The circuit for matching one word is shown in Fig. Each cell requires two AND gates and one
OR gate. The inverters for A_j and K_j are needed once for each column and are used for all
bits in the column. The outputs of all OR gates in the cells of the same word go to the input
of a common AND gate to generate the match signal for M_i. M_i will be logic 1 if a match
occurs and 0 if no match occurs. Note that if the key register contains all 0's, output M_i
will be a 1.
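The product-of-terms form of the match function can be evaluated bit by bit, in the same way
the gates of one word do. The sketch below is illustrative only: the function name match_bit
is hypothetical, and each register is given as a list of 0/1 values with position j = 1 first.

```python
def match_bit(a_bits, f_bits, k_bits) -> int:
    """Evaluate the match bit as the AND over j of (A_j F_ij + A'_j F'_ij + K'_j)."""
    m = 1
    for a, f, k in zip(a_bits, f_bits, k_bits):
        x = (a & f) | ((1 - a) & (1 - f))   # x_j is 1 when the two bits are equal
        m &= x | (1 - k)                    # a term with K_j = 0 is forced to 1
    return m


A = [1, 0, 1, 1, 1, 1, 1, 0, 0]
K = [1, 1, 1, 0, 0, 0, 0, 0, 0]
print(match_bit(A, [1, 0, 0, 1, 1, 1, 1, 0, 0], K))   # 0
print(match_bit(A, [1, 0, 1, 0, 0, 0, 0, 0, 1], K))   # 1
print(match_bit(A, [0, 1, 0, 0, 0, 0, 0, 1, 1], [0] * 9))   # 1, key of all 0's
```

The last line shows the case noted above: with a key of all 0's every term is 1, so the word
is reported as a match whatever it contains.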
READ OPERATION
If more than one word in memory matches the unmasked argument field, all the matched words
will have 1's in the corresponding bit position of the match register. It is then necessary to
scan the bits of the match register one at a time. The matched words are read in sequence by
applying a read signal to each word line whose corresponding M_i bit is a 1.
In most applications, the associative memory stores a table with no two identical items under a
given key. In this case, only one word may match the unmasked argument field. By connecting
output M_i directly to the read line in the same word position (instead of the M register), the
content of the matched word will be presented automatically at the output lines and no
special read command signal is needed.
WRITE OPERATION
An associative memory must have a write capability for storing the information to be searched.
Writing in an associative memory can take different forms, depending on the application. If the
entire memory is loaded with new information at once prior to a search operation, then the
writing can be done by addressing each location in sequence. This will make the device a
random-access memory for writing and a content addressable memory for reading. The
advantage here is that the address for input can be decoded as in a random-access
memory. Thus instead of having m address lines, one for each word in memory, the number of
address lines can be reduced by the decoder to d lines, where m = 2^d.
If unwanted words have to be deleted and new words inserted one at a time, there is a need for a
special register to distinguish between active and inactive words. This register is
sometimes called a tag register; it has one bit for each word in memory.
For every active word stored in memory, the corresponding bit in the tag register is set to 1. A
word is deleted from memory by clearing its tag bit to 0.
Words are stored in memory by scanning the tag register until the first 0 bit is encountered.
This gives the first available inactive word and a position for writing a new word. After the
new word is stored in memory, it is made active by setting its tag bit to 1. An unwanted word,
when deleted from memory, can be cleared to all 0's if this value is used to specify an empty
location.
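The use of the tag register for inserting and deleting words can be sketched as follows. The
class name TaggedMemory and its methods are hypothetical, and clearing a deleted word to all
0's is the optional convention mentioned above.

```python
class TaggedMemory:
    """Associative-memory words with a tag register marking active entries."""

    def __init__(self, size: int) -> None:
        self.words = [0] * size
        self.tags = [0] * size              # tag register: 1 = active, 0 = inactive

    def insert(self, word: int) -> int:
        """Store word in the first inactive position and return that position."""
        for i, tag in enumerate(self.tags):  # scan until the first 0 bit is found
            if tag == 0:
                self.words[i] = word
                self.tags[i] = 1             # make the new word active
                return i
        raise RuntimeError("no inactive word available")

    def delete(self, index: int) -> None:
        """Delete a word by clearing its tag bit."""
        self.tags[index] = 0
        self.words[index] = 0                # optional: all 0's marks an empty location


memory = TaggedMemory(3)
memory.insert(7)
memory.insert(9)
memory.delete(0)
print(memory.insert(5))    # 0, the first inactive position is reused
print(memory.tags)         # [1, 1, 0]
```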
CACHE MEMORY
Locality of Reference: The references to memory in any given time interval tend to
be confined within a localized area.
When a program loop is executed, the CPU repeatedly refers to the set of instructions in
memory that constitute the loop.
Every time a given subroutine is called, its set of instructions is fetched from memory. Thus
loops and subroutines tend to localize the references to memory for fetching instructions.
Iterative procedures refer to common memory locations, and arrays of numbers are confined
within a local portion of memory.
If the active portions of the program and data are placed in a fast small memory, the average
memory access time can be reduced, thus reducing the total execution time of the
program. Such a fast small memory is referred to as a cache memory. The cache is the fastest
component in the memory hierarchy and approaches the speed of CPU components.
When the CPU needs to access memory, the cache is examined. If the word is found in the
cache, it is read from the fast memory. If the word addressed by the CPU is not found in the
cache, the main memory is accessed to read the word. A block of words containing the one just
accessed is then transferred from main memory to cache memory.
The performance of cache memory is frequently measured in terms of a quantity called hit ratio.
When the CPU refers to memory and finds the word in cache, it is said to produce a hit. If the
word is not found in cache, it is in main memory and it counts as a miss. The ratio of
the number of hits divided by the total CPU references to memory (hits plus misses) is the hit
ratio.
The average memory access time of a computer system can be improved considerably by use of a
cache.
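The hit ratio and its effect on the average access time can be worked out numerically. The
figures used below (900 hits, 100 misses, a 100 ns cache, a 1000 ns main memory) are
illustrative assumptions, as is the cost model in which a miss pays for the cache lookup and
then the main memory access. The function names are hypothetical.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Hits divided by the total number of CPU references to memory."""
    return hits / (hits + misses)


def average_access_time(hits: int, misses: int,
                        t_cache: float, t_main: float) -> float:
    """Average time per reference; a miss costs the cache lookup plus main memory."""
    total = hits + misses
    return (hits * t_cache + misses * (t_cache + t_main)) / total


print(hit_ratio(900, 100))                        # 0.9
print(average_access_time(900, 100, 100, 1000))   # 200.0 ns, versus 1000 ns with no cache
```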
The transformation of data from main memory to cache memory is referred to as a mapping
process. Three types of mapping procedures are:
1. Associative mapping
2. Direct mapping
3. Set-associative mapping
Consider the following memory organization:
ASSOCIATIVE MAPPING
The fastest and most flexible cache organization uses an associative memory. The associative
memory stores both the address and content (data) of the memory word. This permits any
location in cache to store any word from main memory.
A CPU address of 15 bits is placed in the argument register and the associative memory is
searched for a matching address. If the address is found, the corresponding 12-bit data is read
and sent to the CPU.
If no match occurs, the main memory is accessed for the word. The address-data pair is then
transferred to the associative cache memory. If the cache is full, an address-data pair must be
displaced to make room for a pair that is needed and not presently in the cache.
The decision as to what pair is replaced is determined from the replacement algorithm that the
designer chooses for the cache. A simple procedure is to replace cells of the cache in
round-robin order whenever a new word is requested from main memory. This constitutes a first-in
first-out (FIFO) replacement policy.
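Associative mapping with FIFO replacement can be sketched with an ordered dictionary standing
in for the associative memory. This is a software model only: the class name AssociativeCache
is hypothetical, and the main-memory addresses and contents in the usage lines are made-up
octal values for illustration.

```python
from collections import OrderedDict


class AssociativeCache:
    """Cache that stores address-data pairs and replaces them in FIFO order."""

    def __init__(self, capacity: int, main_memory: dict) -> None:
        self.capacity = capacity
        self.main_memory = main_memory       # address -> data word
        self.pairs = OrderedDict()           # address -> data, oldest pair first

    def read(self, address: int):
        """Return (data, hit) for a CPU read of the given address."""
        if address in self.pairs:            # matching address found: hit
            return self.pairs[address], True
        data = self.main_memory[address]     # miss: access main memory
        if len(self.pairs) >= self.capacity:
            self.pairs.popitem(last=False)   # displace the oldest address-data pair
        self.pairs[address] = data
        return data, False


# Hypothetical main-memory contents (octal), for illustration only.
main = {0o01000: 0o3450, 0o02777: 0o6710, 0o22345: 0o1234}
cache = AssociativeCache(capacity=2, main_memory=main)
print(cache.read(0o01000))   # (1832, False): first reference is a miss
print(cache.read(0o01000))   # (1832, True): second reference is a hit
```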
DIRECT MAPPING
Associative memories are expensive compared to random-access memories because of the
added logic associated with each cell.
Direct mapping uses RAM instead of CAM.
The n-bit memory address is divided into two fields: k bits for the index field and n - k bits
for the tag field. The direct mapping cache organization uses the n-bit address to access the
main memory and the k-bit index to access the cache.
The internal organization of the words in the cache memory is as shown in Fig.
Each word in cache consists of the data word and its associated tag. When a new word is first
brought into the cache, the tag bits are stored alongside the data bits. When the CPU generates
a memory request, the index field is used for the address to access the cache.
The tag field of the CPU address is compared with the tag in the word read from the cache. If
the two tags match, there is a hit and the desired data word is in cache. If there is no
match, there is a miss and the required word is read from main memory. It is then stored in the
cache together with the new tag, replacing the previous value.
The disadvantage of direct mapping is that the hit ratio can drop considerably if two or more
words whose addresses have the same index but different tags are accessed repeatedly.
Suppose that the CPU now wants to access the word at address 02000. The index address
is 000, so it is used to access the cache. The two tags are then compared. The cache tag is 00
but the address tag is 02, which does not produce a match. Therefore, the main memory is
accessed and the data word 5670 is transferred to the CPU. The cache word at index address
000 is then replaced with a tag of 02 and data of 5670.
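The miss at address 02000 can be traced with a small sketch. It assumes the 15-bit address, 9-bit index, and 6-bit tag used in these notes; the data word 1220 at address 00000 is an assumed value standing in for whatever the cache held under tag 00.

```python
INDEX_BITS = 9                          # k: low-order bits select the cache word
INDEX_MASK = (1 << INDEX_BITS) - 1

def split(address):
    """Split a 15-bit address into (tag, index)."""
    return address >> INDEX_BITS, address & INDEX_MASK

class DirectMappedCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict: address -> data word
        self.lines = {}                 # index -> (tag, data)

    def read(self, address):
        """Return (data, hit) for a CPU read."""
        tag, index = split(address)
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry[1], True       # tags match: hit
        data = self.main_memory[address]
        self.lines[index] = (tag, data) # new tag and data replace the old value
        return data, False

memory = {0o00000: 0o1220, 0o02000: 0o5670}   # 1220 is an assumed value
cache = DirectMappedCache(memory)
cache.read(0o00000)                     # loads tag 00 at index 000
data, hit = cache.read(0o02000)         # same index, tag 02: miss
```

Reading 00000 again after this would miss once more, which is the repeated-conflict case named above as the disadvantage of direct mapping.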
The direct-mapping example above uses a block size of one word. The same organization but using a block size of 8
words is shown in Fig.
The index field is now divided into two parts: the
block field and the word field. The tag field stored
within the cache is common to all eight words of the
same block.
Every time a miss occurs, an entire block of eight words
must be transferred from main memory to cache memory.
Although this takes extra time, the hit ratio
will most likely improve with a larger block size because of
the sequential nature of computer programs.
SET-ASSOCIATIVE MAPPING
Set-associative mapping is an improvement over the direct-mapping organization in that
each word of cache can store two or more words of memory under the same index address.
Each data word is stored together with its tag and the number
of tag-data items in one word of cache is said to form a set.
Each index address refers to two data words and their
associated tags. Each tag requires six bits and
each data word has 12 bits, so the word length is 2(6 + 12) = 36
bits. An index address of nine bits can accommodate 512
words. Thus the size of cache memory is 512 × 36. It can
accommodate 1024 words of main memory, since each word of cache holds two data words.
The words stored at addresses 01000 and 02000 of main memory are stored in cache memory at
index address 000. Similarly, the words at addresses 02777 and 00777 are stored in cache at
index address 777.
When the CPU generates a memory request, the index value of the address is used to access the
cache. The tag field of the CPU address is then compared with both tags in the cache to
determine if a match occurs.
The hit ratio will improve as the set size increases because more words with the same index but
different tags can reside in cache.
When a miss occurs in a set-associative cache and the set is full, it is necessary to replace one of the
tag-data items with a new value. The most common replacement algorithms used are: random
replacement, first-in first-out (FIFO), and least recently used (LRU).
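A two-way set-associative cache with LRU replacement inside each set might be sketched as follows. The 9-bit index matches the example above; the class name, the list-based LRU ordering, and the data values are assumptions made for the illustration.

```python
INDEX_BITS = 9
INDEX_MASK = (1 << INDEX_BITS) - 1

class SetAssociativeCache:
    def __init__(self, main_memory, ways=2):
        self.main_memory = main_memory
        self.ways = ways                # set size: tag-data items per index
        self.sets = {}                  # index -> list of (tag, data), LRU first

    def read(self, address):
        """Return (data, hit) for a CPU read."""
        tag, index = address >> INDEX_BITS, address & INDEX_MASK
        items = self.sets.setdefault(index, [])
        for pos, (stored_tag, data) in enumerate(items):
            if stored_tag == tag:       # compare against every tag in the set
                items.append(items.pop(pos))   # now the most recently used
                return data, True
        data = self.main_memory[address]
        if len(items) >= self.ways:
            items.pop(0)                # set full: replace least recently used
        items.append((tag, data))
        return data, False

memory = {0o00000: 0o1220, 0o01000: 0o3450, 0o02000: 0o5670}  # assumed data
cache = SetAssociativeCache(memory)
trace = (0o01000, 0o02000, 0o01000, 0o02000, 0o00000, 0o01000)
hits = [cache.read(a)[1] for a in trace]
```

Addresses 01000 and 02000 share index 000 but now coexist, so the alternating accesses hit. A third tag at the same index forces a replacement.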
WRITING INTO CACHE
An important aspect of cache organization is concerned with memory write requests. If the
operation is a write, there are two ways that the system can proceed.
The simplest and most commonly used procedure is to update main memory with every
memory write operation, with cache memory being updated in parallel if it contains the word at
the specified address. This is called the write-through method. This method has the advantage
that main memory always contains the same data as the cache. This characteristic is important
in systems with direct memory access transfers.
The second procedure is called the write-back method. In this method only the cache location is
updated during a write operation. The location is then marked by a flag so that later, when the
word is removed from the cache, it is copied into main memory. The reason for the write-back
method is that during the time a word resides in the cache, it may be updated several times;
however, as long as the word remains in the cache, it does not matter whether the copy in main
memory is out of date, since requests for the word are filled from the cache. It is only when
the word is displaced from the cache that an accurate copy need be rewritten into main memory.
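The difference between the two methods shows up in the number of main-memory writes. The sketch below counts them for three successive writes to one word that is already in the cache; the class and its attribute names are hypothetical, and the write-back branch assumes the written word is placed in the cache.

```python
class WriteCache:
    def __init__(self, policy):
        self.policy = policy            # "write-through" or "write-back"
        self.cache = {}                 # address -> data
        self.dirty = set()              # addresses flagged as modified
        self.memory = {}
        self.memory_writes = 0

    def _write_memory(self, address, data):
        self.memory[address] = data
        self.memory_writes += 1

    def write(self, address, data):
        if self.policy == "write-through":
            self._write_memory(address, data)   # memory updated on every write
            if address in self.cache:
                self.cache[address] = data      # cache updated in parallel
        else:
            self.cache[address] = data          # only the cache is updated
            self.dirty.add(address)             # flag for a later copy-back

    def evict(self, address):
        data = self.cache.pop(address, None)
        if address in self.dirty:
            self.dirty.discard(address)
            self._write_memory(address, data)   # copy back on displacement

wt, wb = WriteCache("write-through"), WriteCache("write-back")
for c in (wt, wb):
    c.cache[0o100] = 0                  # word already resident in the cache
    for value in (1, 2, 3):
        c.write(0o100, value)
    c.evict(0o100)
```

Both policies leave the value 3 in main memory, but write-through needs three memory writes where write-back needs one.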
CACHE INITIALIZATION
The cache is initialized when power is applied to the computer or when the main memory is
loaded with a complete set of programs from auxiliary memory. After initialization the cache is
considered to be empty, but in effect it contains some non-valid data. It is customary
to include with each word in cache a valid bit to indicate whether or not the word contains
valid data.
The cache is initialized by clearing all the valid bits to 0. The valid bit of a particular
cache word is set to 1 the first time this word is loaded from main memory and stays set unless
the cache has to be initialized again. With the valid bit in place, a word in cache whose
valid bit is 0 cannot produce a hit, so the initialization condition has the effect of forcing misses from the cache until it fills
with valid data.
VIRTUAL MEMORY
In a memory hierarchy system, programs and data are brought into main memory as they are
needed by the CPU.
Virtual memory is a concept used in some large computer systems that permits the user to
construct programs as though a large memory space were available, equal to the totality of
auxiliary memory.
A virtual memory system provides a mechanism for translating program-generated addresses
into correct main memory locations. This is done dynamically, while programs are being
executed in the CPU. The translation or mapping is handled automatically by the hardware by
means of a mapping table.
ADDRESS SPACE AND MEMORY SPACE
An address used by a programmer will be called a virtual address, and the set of such addresses
the address space.
An address in main memory is called a location or physical address. The set of such locations is
called the memory space.
In most computers the address and memory spaces are identical. The address space is
allowed to be larger than the memory space in computers with virtual memory.
As an illustration, consider a computer with a main-memory capacity of 32K words (K = 1024).
Fifteen bits are needed to specify a physical address in memory since 32K = 2^15. Suppose that
the computer has available auxiliary memory for storing 2^20 = 1024K words. Thus auxiliary
memory has a capacity for storing information equivalent to the capacity of 32 main memories.
Denoting the address space by N and the memory space by M, we then have for this example
N = 1024K and M = 32K.
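The figures in this illustration can be checked with a few lines of arithmetic. The variable names below are chosen for the example only.

```python
K = 1024
M = 32 * K                              # memory space: main-memory words
N = 1024 * K                            # address space: auxiliary-memory words

physical_bits = (M - 1).bit_length()    # bits needed for a physical address
virtual_bits = (N - 1).bit_length()     # bits needed for a virtual address
ratio = N // M                          # main memories that fit in auxiliary memory
```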
In a multiprogram computer system, programs and data are transferred to and from auxiliary
memory and main memory based on demands imposed by the CPU. Suppose that program 1 is
currently being executed in the CPU. Program 1 and a portion of its associated data is moved
from auxiliary memory into main memory as shown in Fig.
Note that the line address in address space and memory space is the same; the only mapping
required is from a page number to a block number.
The organization of the memory mapping table in a paged system is shown in Fig.
The memory-page table consists of
eight words, one for each page. The
address in the page table denotes the
page number and the content of the
word gives the block number where that
page is stored in main memory. The table shows that pages 1, 2, 5, and 6 are now available in
main memory in blocks 3, 0, 1, and 2, respectively. A presence bit in each location indicates
whether the page has been transferred from auxiliary memory into main memory. A 0 in the presence
bit indicates that this page is not available in main memory. The CPU references a word in
memory with a virtual address of 13 bits. The three high-order bits of the virtual address specify a
page number and also an address for the memory-page table. The content of the word in the
memory-page table at the page number address is read out into the memory table buffer register. If
the presence bit is a 1, the block number thus read is transferred to the two high-order bits of
the main memory address register. The line number from the virtual address is transferred into
the 10 low-order bits of the memory address register. A read signal to main memory transfers
the content of the word to the main memory buffer register ready to be used by the CPU. If the
presence bit in the word read from the page table is 0, it signifies that the content of the word
referenced by the virtual address does not reside in main memory. A call to the
operating system is then generated to fetch the required page from auxiliary memory and place
it into main memory before resuming computation.
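The translation just described can be sketched as follows, using the page-to-block assignments from the table (pages 1, 2, 5, 6 in blocks 3, 0, 1, 2). A missing dictionary key stands in for a presence bit of 0, and the exception name is an assumption for the example.

```python
LINE_BITS = 10                          # low-order bits: line within the page
LINE_MASK = (1 << LINE_BITS) - 1

# page number -> block number, for pages whose presence bit is 1
PAGE_TABLE = {1: 3, 2: 0, 5: 1, 6: 2}

class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""

def translate(virtual_address):
    """Map a 13-bit virtual address to a 12-bit main-memory address."""
    page = virtual_address >> LINE_BITS         # three high-order bits
    line = virtual_address & LINE_MASK          # line number is unchanged
    if page not in PAGE_TABLE:                  # presence bit is 0
        raise PageFault(page)
    return (PAGE_TABLE[page] << LINE_BITS) | line

physical = translate(0b101_0101010011)          # page 5 maps to block 1
```

Only the page number is replaced by a block number; the 10-bit line number passes through untouched.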
ASSOCIATIVE MEMORY PAGE TABLE
A random-access memory page table is inefficient with respect to storage utilization.
In general, a system with n pages and m blocks would require a memory-page table of n locations, of
which up to m locations will be marked with block numbers and all others will be empty.
Consider an address space of 1024K words and memory space of 32K words. If each page or
block contains 1K words, the number of pages is 1024 and the number of blocks 32. The
capacity of the memory-page table must be 1024 words and only 32 locations may have a
presence bit equal to 1. At any given time, at least 992 locations will be empty and not in use.
A more efficient way to organize the page table would be to construct it with a number
of words equal to the number of blocks in main memory.
This method can be implemented by means of an associative memory with each word in
memory containing a page number together with its corresponding block number. The
page field in each word is compared with the page number in the virtual address. If a match
occurs, the word is read from memory and its corresponding block number is extracted.
Consider the case of eight pages and four blocks.
Each entry in the associative memory array consists of
two fields. The first three bits specify a field for storing
the page number. The last two bits constitute a field for
storing the block number. The virtual address is placed
in the argument register. The page number bits in the
argument are compared with all page numbers in the
page field of the associative memory. If the
page number is found, the 5-bit word is read out from
memory. The corresponding block number, being in
the same word, is transferred to the main memory address
register. If no match occurs, a call to the operating system is generated to bring the required
page from auxiliary memory.
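A masked search over the four 5-bit words can be sketched as below. The table contents reuse the earlier page-to-block assignments as an assumption, and the loop stands in for hardware that compares every word at the same time.

```python
BLOCK_BITS = 2
LINE_BITS = 10

# One 5-bit word per block: 3-bit page number followed by 2-bit block number.
ASSOCIATIVE_TABLE = [0b001_11, 0b010_00, 0b101_01, 0b110_10]
KEY_MASK = 0b111_00                     # compare only the page-number field

def lookup_block(virtual_address):
    """Return the block number for the page, or None if there is no match."""
    argument = (virtual_address >> LINE_BITS) << BLOCK_BITS  # page in high bits
    for word in ASSOCIATIVE_TABLE:      # hardware does this in parallel
        if (word & KEY_MASK) == (argument & KEY_MASK):
            return word & ((1 << BLOCK_BITS) - 1)
    return None                         # no match: call the operating system

block = lookup_block(0b101_0101010011)  # page 5
```

The table needs only four words here, one per block, instead of eight words, one per page.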
PAGE REPLACEMENT
A virtual memory system is a combination of hardware and software techniques. The memory
management software system handles all the software operations for the efficient utilization of
memory space. It must decide (1) which page in main memory ought to be removed to make
room for a new page, (2) when a new page is to be transferred from auxiliary memory to main
memory, and (3) where the page is to be placed in main memory.
The hardware mapping mechanism and the memory management software together constitute
the architecture of a virtual memory.
When a program starts execution, one or more pages are transferred into main memory and the
page table is set to indicate their position. The program is executed from main memory until it
attempts to reference a page that is still in auxiliary memory. This condition is called
a page fault. When a page fault occurs, the execution of the present program is suspended until
the required page is brought into main memory. Since loading a page from auxiliary memory
to main memory is basically an I/O operation, the operating system assigns this task to the I/O
processor.
In the meantime, control is transferred to the next program in memory that is waiting to be
processed in the CPU. Later, when the memory block has been assigned and the transfer
completed, the original program can resume its operation.
When a page fault occurs in a virtual memory system, it signifies that the page referenced
by the CPU is not in main memory. A new page is then transferred from auxiliary memory to main
memory. If main memory is full, it would be necessary to remove a page from a memory block
to make room for the new page. The policy for choosing pages to remove is determined
from the replacement algorithm that is used.
Two of the most common replacement algorithms used are the first-in first-out (FIFO) and the
least recently used (LRU). The FIFO algorithm selects for replacement the page that has been in
memory the longest time. Each time a page is loaded into memory, its identification number is
pushed into a FIFO stack. The FIFO will be full whenever memory has no more empty
blocks. When a new page must be loaded, the page least recently brought in is removed. The
page to be removed is easily determined because its identification number is at the top of the
FIFO stack. The FIFO replacement policy has the advantage of being easy to implement. It
has the disadvantage that under certain circumstances pages are removed and loaded from
memory too frequently.
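A minimal FIFO simulation, assuming a simple list of page references (the reference string below is made up for the example), might look like this:

```python
from collections import deque

def fifo_page_faults(references, num_blocks):
    """Count page faults under FIFO replacement."""
    resident = deque()                  # oldest loaded page at the left
    faults = 0
    for page in references:
        if page in resident:
            continue                    # page already in main memory
        faults += 1
        if len(resident) == num_blocks:
            resident.popleft()          # remove the page loaded longest ago
        resident.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults_3 = fifo_page_faults(refs, num_blocks=3)
faults_4 = fifo_page_faults(refs, num_blocks=4)
```

On this particular string, four blocks produce more faults than three, which illustrates how FIFO can remove and reload pages too frequently.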
The LRU policy is more difficult to implement but has been more attractive on the assumption
that the least recently used page is a better candidate for removal than the least recently loaded
page in FIFO. The LRU algorithm can be implemented by associating a counter with every
page that is in main memory. When a page is referenced, its associated counter is set to zero. At
fixed intervals of time, the counters associated with all pages presently in memory are
incremented by 1. The least recently used page is the page with the highest count. The counters
are often called aging registers, as their count indicates their age, that is, how long ago their
associated pages have been referenced.
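The counter scheme can be sketched as follows. The class name is hypothetical, and the usage assumes one timer tick after every reference purely to keep the trace short.

```python
class AgingLRU:
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.age = {}                   # resident page -> aging register

    def tick(self):
        """Fixed-interval timer: every resident page grows older by 1."""
        for page in self.age:
            self.age[page] += 1

    def reference(self, page):
        """Reference a page; return the page removed to make room, if any."""
        removed = None
        if page not in self.age and len(self.age) == self.num_blocks:
            removed = max(self.age, key=self.age.get)   # highest count = LRU
            del self.age[removed]
        self.age[page] = 0              # counter is set to zero on reference
        return removed

lru = AgingLRU(num_blocks=3)
removals = []
for page in (1, 2, 3, 1, 4, 2):
    removals.append(lru.reference(page))
    lru.tick()                          # assumed: one tick per reference
```

Page 1 survives the first replacement because its counter was reset by the second reference, so page 2 is removed instead.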
PERIPHERAL DEVICES
The input-output subsystem of a computer, referred to as I/O, provides an efficient mode of
communication between the central system and the outside environment.
Input or output devices attached to the computer are also called peripherals.
Input Devices
Keyboard
Optical input devices
- Card Reader
- Bar code reader
- Digitizer
- Optical Mark Reader
Screen Input Devices
- Touch Screen
- Light Pen
- Mouse
Analog Input Devices
Output Devices
CRT
Printer (Impact, Ink Jet, Laser, Dot Matrix)
Plotter
Speakers
Input and output devices that communicate with people and the computer are usually involved
in the transfer of alphanumeric information to and from the device and the computer. The
standard binary code for this alphanumeric information is ASCII
(American Standard Code for Information Interchange).
ASCII is a 7-bit code, but most computers manipulate an 8-bit quantity as a single unit called a
byte. Therefore, ASCII characters most often are stored one per byte.
INPUT-OUTPUT INTERFACE
Input-output interface provides a method for transferring information between internal storage
and external I/O devices. Peripherals connected to a computer need special
communication links for interfacing them with the central processing unit. The purpose of the
communication link is to resolve the differences that exist between the central computer and
each peripheral. The major differences are:
1. Peripherals are electromechanical and electromagnetic devices and their manner
of operation is different from the operation of the CPU and memory, which are electronic
devices. Therefore, a conversion of signal values may be required.
2. The data transfer rate of peripherals is usually slower than the transfer rate of the CPU, and
consequently, a synchronization mechanism may be needed.
3. Data codes and formats in peripherals differ from the word format in the CPU and memory.
4. The operating modes of peripherals are different from each other and each must be controlled so
as not to disturb the operation of other peripherals connected to the CPU.
To resolve these differences, computer systems include special hardware components between
the CPU and peripherals to supervise and synchronize all input and output transfers. These
components are called interface units because they interface between the processor bus and
the peripheral device. In addition, each device may have its own controller that supervises the
operations of the particular mechanism in the peripheral.
I/O BUS AND INTERFACE MODULES
A typical communication link between the processor and several peripherals is shown in Fig.
The I/O bus consists of data lines, address lines, and control lines.
Each peripheral device has associated
with it an interface unit. Each interface
decodes the address and control received
from the I/O bus, interprets them for the
peripheral, and provides signals for the
peripheral controller. It also synchronizes the data
flow and supervises the transfer between peripheral and processor. Each peripheral has its own
controller that operates the particular electromechanical device.
To communicate with a particular device, the processor places a device address on the address
lines. Each interface attached to the I/O bus contains an address decoder that monitors the
address lines. When the interface detects its own address, it activates the path between the bus
lines and the device that it controls. All peripherals whose address does not correspond to the
address in the bus are disabled by their interface.
At the same time, the processor provides a function code in the control lines.
There are four types of commands that an interface may receive. They are classified as control,
status, data output, and data input.
A control command is issued to activate the peripheral and to inform it what to do.
A status command is used to test various status conditions in the interface and the peripheral.
A data output command causes the interface to respond by transferring data from the bus
into one of its registers.
The data input command is the opposite of the data output. In this case the interface receives an
item of data from the peripheral and places it in its buffer register.
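The address decoding and the four command types can be modeled roughly as below. Everything here is a hypothetical sketch: the class, the device addresses, and the string command names are assumptions, and real function codes are bit patterns on the control lines.

```python
class Interface:
    """Hypothetical interface unit attached to a shared I/O bus."""

    def __init__(self, address, peripheral_data=None):
        self.address = address
        self.peripheral_data = peripheral_data  # what the device would supply
        self.buffer = None                      # interface data register
        self.status = "idle"

    def on_bus(self, address, command, data=None):
        if address != self.address:     # address decoder: not our address
            return None
        if command == "control":
            self.status = "active"      # activate the peripheral
        elif command == "status":
            return self.status          # report status conditions
        elif command == "data_output":
            self.buffer = data          # bus -> interface register
        elif command == "data_input":
            self.buffer = self.peripheral_data  # peripheral -> buffer register
            return self.buffer
        return None

def bus_cycle(interfaces, address, command, data=None):
    """Broadcast one address and function code; only one interface responds."""
    replies = [unit.on_bus(address, command, data) for unit in interfaces]
    return next((r for r in replies if r is not None), None)

keyboard = Interface(address=0b01, peripheral_data=ord("A"))
printer = Interface(address=0b10)
units = [keyboard, printer]

bus_cycle(units, 0b10, "control")
bus_cycle(units, 0b10, "data_output", 0x41)
typed = bus_cycle(units, 0b01, "data_input")
```

Every interface sees every bus cycle, but only the one whose decoder matches the address acts on the command.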
I/O VERSUS MEMORY BUS
In addition to communicating with I/O, the processor must communicate with the memory unit.
Like the I/O bus, the memory bus contains data, address, and read/write control lines. There are
three ways that computer buses can be used to communicate with memory and I/O:
1. Use two separate buses, one for memory and the other for I/O.
2. Use one common bus for both memory and I/O but have separate control lines for each.
3. Use one common bus for memory and I/O with common control lines.
In the first method, the computer has independent sets of data, address, and control buses, one
for accessing memory and the other for I/O. This is done in computers that provide a separate
I/O processor (IOP) in addition to the central processing unit (CPU). The memory
communicates with both the CPU and the IOP through a memory bus. The IOP communicates
also with the input and output devices through a separate I/O bus with its own address, data and
control lines. The purpose of the IOP is to provide an independent pathway for the transfer of
information between external devices and internal memory.
ISOLATED VERSUS MEMORY-MAPPED I/O
Many computers use one common bus to transfer information between memory or I/O and the
CPU. The distinction between a memory transfer and I/O transfer is made through separate read
and write lines. The CPU specifies whether the address on the address lines is for a memory
word or for an interface register by enabling one of two possible read or write lines. The I/O
read and I/O write control lines are enabled during an I/O transfer. The memory read and
memory write control lines are enabled during a memory transfer.
In the isolated I/O configuration, the CPU has distinct input and output instructions, and
each of these instructions is associated with the address of an interface register. When the CPU
fetches and decodes the operation code of an input or output instruction, it places the address
associated with the instruction into the common address lines. At the same time, it enables the
I/O read (for input) or I/O write (for output) control line. This informs the external components
that are attached to the common bus that the address in the address lines is for an interface
register and not for a memory word. On the other hand, when the CPU is fetching an instruction or
an operand from memory, it places the memory address on the address lines and enables the
memory read or memory write control line. This informs the external components that the
address is for a memory word and not for an I/O interface.
The other alternative is to use the same address space for both memory and I/O. This is the case
in computers that employ only one set of read and write signals and do not distinguish between
memory and I/O addresses. This configuration is referred to as memory mapped I/O. The
computer treats an interface register as being part of the memory system.
In a memory-mapped I/O organization there are no specific input or output instructions.
The CPU can manipulate I/O data residing in interface registers with the same instructions that
are used to manipulate memory words. Each interface is organized as a set of registers that
respond to read and write requests in the normal address space. Typically, a segment
of the total address space is reserved for interface registers, but in general, they can be located at
any address as long as there is not also a memory word that responds to the same address.
Computers with memory-mapped I/O can use memory-type instructions to access I/O data. It
allows the computer to use the same instructions for either input-output transfers or for memory
transfers.
The advantage is that the load and store instructions used for reading and writing from memory
can be used to input and output data from I/O registers.
In a typical computer, there are more memory-reference instructions than I/O instructions. With
memory-mapped I/O all instructions that refer to memory are also available for I/O.
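The two configurations can be contrasted with a small model. Both classes are hypothetical sketches: the `io` flag stands in for enabling the I/O read or I/O write line instead of the memory line, and the reserved segment starting at 0xFF00 is an assumed address range.

```python
class IsolatedBus:
    """One common bus with separate memory and I/O read/write control lines."""

    def __init__(self):
        self.memory = {}
        self.io_registers = {}

    def write(self, address, data, io=False):
        # io=True models the I/O write line; io=False the memory write line.
        (self.io_registers if io else self.memory)[address] = data

    def read(self, address, io=False):
        return (self.io_registers if io else self.memory).get(address)

class MemoryMappedBus:
    """One address space; a reserved segment responds as interface registers."""

    def __init__(self, io_base, io_size):
        self.io_range = range(io_base, io_base + io_size)
        self.memory = {}
        self.io_registers = {}

    def _target(self, address):
        return self.io_registers if address in self.io_range else self.memory

    def write(self, address, data):     # the same store reaches memory or I/O
        self._target(address)[address] = data

    def read(self, address):            # the same load reaches memory or I/O
        return self._target(address).get(address)

iso = IsolatedBus()
iso.write(0x10, 7)                      # memory word at address 0x10
iso.write(0x10, 9, io=True)             # interface register at the same address

mm = MemoryMappedBus(io_base=0xFF00, io_size=0x100)
mm.write(0x0010, 7)                     # ordinary memory word
mm.write(0xFF01, 0x41)                  # lands in an interface register
```

With isolated I/O the same address can name both a memory word and an interface register. With memory-mapped I/O each address belongs to one or the other.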
EXAMPLE OF I/O INTERFACE
An example of an I/O interface unit is shown in the block diagram.
It consists of two data registers called ports, a control register, a status register, bus buffers, and
timing and control circuits. The interface communicates with the CPU through the data bus.
The chip select and register select inputs determine the address assigned to the interface. The
I/O read and write are two control lines that specify an input or output, respectively.
The four registers communicate directly with the I/O device attached to the interface. The I/O
data to and from the device can be transferred into either port A or port B.
The interface may operate with an output device or with an input device, or with a device that
requires both input and output.
A command is passed to the I/O device by sending a word to the appropriate interface register.
In a system like this, the function code in the I/O bus is not needed because control is sent to the
control register, status information is received from the status register, and data are transferred
to and from the port A and port B registers. Thus the transfer of data, control, and status information is
always via the common data bus.
The distinction between data, control, or status information is determined from the particular
register with which the CPU communicates.
The control register receives control information from the CPU. By loading appropriate
bits into the control register, the interface and the I/O device attached to it can be placed in a
variety of operating modes.
The interface registers communicate with the CPU through the bidirectional data bus.
The address bus selects the interface unit through the chip select and the two register select
inputs. A circuit must be provided externally (usually, a decoder) to detect the address assigned
to the interface registers. This circuit enables the chip select (CS) input when the interface is
selected by the address bus. The two register select inputs RS1 and RS0 are usually
connected to the two least significant lines of the address bus. These two inputs select one
of the four registers in the interface as specified in the table accompanying the diagram.
The content of the selected register is transferred into the CPU via the data bus when the I/O read
signal is enabled. The CPU transfers binary information into the selected register via the data
bus when the I/O write input is enabled.
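Register selection can be sketched as a lookup on (RS1, RS0) gated by CS. The particular assignment of codes to registers below is an assumption, since the accompanying table is not reproduced in these notes; the class and method names are likewise hypothetical.

```python
# Assumed (RS1, RS0) -> register assignment; the actual table may differ.
REGISTER_SELECT = {
    (0, 0): "port_a",
    (0, 1): "port_b",
    (1, 0): "control",
    (1, 1): "status",
}

class InterfaceUnit:
    def __init__(self):
        self.registers = dict.fromkeys(REGISTER_SELECT.values(), 0)

    def access(self, cs, rs1, rs0, io_read=False, io_write=False, data=None):
        """One bus access; returns the register content on an I/O read."""
        if not cs:                      # chip not selected: bus is ignored
            return None
        name = REGISTER_SELECT[(rs1, rs0)]
        if io_write:
            self.registers[name] = data     # CPU -> selected register
        if io_read:
            return self.registers[name]     # selected register -> CPU
        return None

unit = InterfaceUnit()
unit.access(cs=1, rs1=1, rs0=0, io_write=True, data=0b101)   # load a mode word
mode = unit.access(cs=1, rs1=1, rs0=0, io_read=True)
ignored = unit.access(cs=0, rs1=1, rs0=0, io_read=True)
```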
ASYNCHRONOUS DATA TRANSFER
The internal operations in a digital system are synchronized by means of clock pulses supplied
by a common pulse generator.
If the registers in the interface share a common clock with the CPU registers, the transfer
between the two units is said to be synchronous.
In most cases, the internal timing in each unit is independent from the other in that each uses its
own private clock for internal registers. In that case, the two units are said to be
asynchronous to each other.
Asynchronous data transfer between two independent units requires that control signals be
transmitted between the communicating units to indicate the time at which data is being
transmitted.
One way of achieving this is by means of a strobe pulse supplied by one of the units to indicate to
the other unit when the transfer has to occur. Another method commonly used is to
accompany each data item being transferred with a control signal that indicates the presence of
data in the bus. The unit receiving the data item responds with another control signal to
acknowledge receipt of the data. This type of agreement between two independent units is
referred to as handshaking.
STROBE CONTROL
The strobe control method of asynchronous data transfer employs a single control line to time
each transfer. The strobe may be activated by either the source or the destination unit.
The data bus carries the binary information from
source unit to the destination unit. Typically, the
bus has multiple lines to transfer an entire byte
or word. The strobe is a single line that informs
the destination unit when a valid data word is
available in the bus.
As shown in the timing diagram the source unit
first places the data on the data bus. After a brief
delay to ensure that the data settle to a steady
value, the source activates the strobe pulse.
The information on the data bus and the strobe signal remain in the active state for a sufficient
time period to allow the destination unit to receive the data. Often, the destination unit uses the
falling edge of the strobe pulse to transfer the contents of the data bus into one of its internal
registers.
The following figure shows a data transfer initiated by the destination unit.
In this case the destination unit activates the strobe pulse,
informing the source to provide the data. The source
unit responds by placing the requested binary
information on the data bus. The data must be valid and
remain in the bus long enough for the destination unit
to accept it. The falling edge of the strobe pulse can be
used again to trigger a destination register. The
destination unit then disables the strobe.
The transfer of data between the CPU and an interface unit is similar to the strobe transfer. Data
transfer between an interface and an I/O device is commonly controlled by a set of handshaking
lines.
HANDSHAKING
The disadvantage of the strobe method is that the source unit that initiates the transfer has no
way of knowing whether the destination unit has actually received the data item that was placed
in the bus. Similarly, a destination unit that initiates the transfer has no way of knowing whether the
source unit has actually placed the data on the bus. The handshake method solves this
problem by introducing a second control signal that provides a reply to the unit that initiates the
transfer.
The basic principle of the handshaking method of data transfer is as follows. One control line is in
the same direction as the data flow in the bus from the source to the destination. It is used by
the source unit to inform the destination unit whether there are valid data in the bus.
The other control line is in the other direction from the destination to the source. It is used
by the destination unit to inform the source whether it can accept data. The sequence of control
during the transfer depends on the unit that initiates the transfer.
The two handshaking lines are data valid, which is generated by the source unit, and data
accepted, generated by the destination unit.
The timing diagram shows the exchange of signals between the two units. In the destination-
initiated transfer the source does not place data on the bus until after it receives the ready for
data signal from the destination unit.
The handshaking scheme provides a high degree of flexibility and reliability because the successful
completion of a data transfer relies on active participation by both units. If one unit is faulty, the data
transfer will not be completed. Such an error can be detected by means of a timeout
mechanism, which produces an alarm if the data transfer is not completed within a
predetermined time. The timeout is implemented by means of an internal clock that starts
counting time when the unit enables one of its handshaking control signals. If the return
handshake signal does not respond within a given time period, the unit assumes that an error has
occurred.
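A source-initiated handshake with a timeout can be traced with the sketch below. It follows the usual data valid / data accepted sequence; the function name, the tick-based delays, and the event strings are assumptions made for the illustration.

```python
def source_initiated_transfer(data, destination_delay, timeout):
    """Trace a source-initiated handshake; delays are in clock ticks.

    destination_delay=None models a faulty destination that never replies.
    Returns (data received or None, list of events).
    """
    events = ["source: place data on bus", "source: enable data valid"]
    waited = 0                          # internal clock starts counting here
    while destination_delay is None or waited < destination_delay:
        if waited >= timeout:
            events.append("source: timeout, signal error")
            return None, events
        waited += 1
    events += [
        "destination: accept data, enable data accepted",
        "source: disable data valid",
        "destination: disable data accepted",
    ]
    return data, events

received, ok_events = source_initiated_transfer(0x41, destination_delay=2, timeout=5)
failed, bad_events = source_initiated_transfer(0x41, destination_delay=None, timeout=5)
```

The source removes its data only after data accepted arrives, so a destination that never answers is detected by the timeout instead of silently losing the item.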
ASYNCHRONOUS SERIAL TRANSFER
The transfer of data between two units may be done in parallel or serial. In parallel data
transmission, each bit of the message has its own path and the total message is transmitted
at the same time.
In serial data transmission, each bit in the message is sent in sequence one at a time.
Parallel transmission is faster but requires many wires. It is used for short distances and where
speed is important. Serial transmission is slower but is less expensive since it requires only one
pair of conductors.
Serial transmission can be synchronous or asynchronous. In synchronous transmission, the
two units share a common clock frequency and bits are transmitted continuously at the rate
dictated by the clock pulses. In long-distance serial transmission, each unit is driven by a
separate clock of the same frequency. Synchronization signals are transmitted periodically
between the two units to keep their clocks in step with each other.
In asynchronous transmission, binary information is sent only when it is available and the line
remains idle when there is no information to be transmitted.
A serial asynchronous data transmission technique used in many interactive terminals employs
special bits that are inserted at both ends of the character code. With this technique, each
character consists of three parts: a start bit, the character bits, and stop bits.
The convention is that the transmitter rests at the 1-state when no characters are
transmitted. The first bit, called the start bit, is always a 0 and is used to indicate the beginning
of a character. The last bit called the stop bit is always a 1.
An example of this format is shown in Fig.
A transmitted character can be detected by the receiver from knowledge of the transmission
rules:
1. When a character is not being sent, the line is kept in the 1-state.
2. The initiation of a character transmission is detected from the start bit, which is always 0.
3. The character bits always follow the start bit.
4. After the last bit of the character is transmitted, a stop bit is detected when the line returns to
the 1-state for at least one bit time.
Using these rules, the receiver can detect the start bit when the line goes from 1 to 0. A clock in the
receiver examines the line at proper bit times. The receiver knows the transfer rate of the bits
and the number of character bits to accept. After the character bits are transmitted, one or two
stop bits are sent. The stop bits are always in the 1-state and frame the end of the character to
signify the idle or wait state.
At the end of the character the line is held at the 1-state for a period of at least one or two bit
times so that both the transmitter and receiver can resynchronize. The length of time that
the line stays in this state depends on the amount of time required for the equipment to
resynchronize.
Some older electromechanical terminals use two stop bits, but newer terminals use one stop bit.
The line remains in the 1-state until another character is transmitted. The stop time ensures that
a new character will not follow for one or two bit times.
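The framing rules above can be illustrated with a small Python sketch. It is only a model under stated assumptions: the line is represented as a list of levels sampled once per bit time, every character has eight data bits, and the function names `frame_character` and `receive` are made up for this illustration.

```python
def frame_character(data_bits, stop_bits=1):
    """Return the line levels for one character: start bit, data bits, stop bits."""
    return [0] + list(data_bits) + [1] * stop_bits


def receive(line, num_data_bits=8):
    """Scan line levels (one sample per bit time) and extract the characters."""
    chars = []
    i = 0
    while i < len(line):
        if line[i] == 1:           # idle or stop level: keep waiting
            i += 1
            continue
        # line[i] == 0 after a 1 is the transition that marks a start bit
        data = line[i + 1 : i + 1 + num_data_bits]
        if len(data) < num_data_bits:
            break                  # incomplete character at the end of the samples
        chars.append(data)
        i += 1 + num_data_bits     # continue scanning at the stop bit
    return chars


first = [1, 0, 0, 0, 0, 0, 1, 0]
second = [0, 1, 1, 0, 0, 0, 1, 0]
# Idle line, one character with one stop bit, idle time, one character with two stop bits.
line = [1, 1] + frame_character(first) + [1, 1, 1] + frame_character(second, stop_bits=2)
print(receive(line))
```

The receiver counts bit times after the start bit instead of looking for another transition, which is why data bits equal to 0 are not mistaken for a new start bit.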
Asynchronous Communication Interface
The block diagram of an asynchronous communication interface is shown in Fig.
are appended to each character. Two bits in the status register are used as flags. One bit is used
to indicate whether the transmitter register is empty and another bit is used to indicate whether
the receiver register is full.
The operation of the transmitter portion of the interface is as follows. The CPU reads the status
register and checks the flag to see if the transmitter register is empty. If it is empty, the CPU
transfers a character to the transmitter register and the interface clears the flag to mark the
register full. The first bit in the transmitter shift register is set to 0 to generate a start bit. The
character is transferred in parallel from the transmitter register to the shift register and the
appropriate number of stop bits are appended into the shift register. The transmitter register is
then marked empty. The character can now be transmitted one bit at a time by shifting the data
in the shift register at the specified baud rate. The CPU can transfer another character to the
transmitter register after checking the flag in the status register. The interface is said to be
double buffered because a new character can be loaded as soon as the previous one starts
transmission.
The operation of the receiver portion of the interface is similar. The receive data input is in the
1-state when the line is idle. The receiver control monitors the receive-data line for a 0 signal to
detect the occurrence of a start bit. Once a start bit has been detected, the incoming bits of the
character are shifted into the shift register at the prescribed baud rate. After receiving the data
bits, the interface checks for the parity and stop bits. The character without the start and
stop bits is then transferred in parallel from the shift register to the receiver register. The flag in
the status register is set to indicate that the receiver register is full. The CPU reads the
status register and checks the flag, and if set, it reads the data from the receiver register. The
interface checks for any possible errors during transmission and sets appropriate bits in the
status register. The CPU can read the status register at any time to check if any errors have
occurred. Three possible errors that the interface checks during transmission are parity error,
framing error, and overrun error. A parity error occurs if the number of 1's in the received data
does not have the correct parity. A framing error occurs if the right number of stop bits is not
detected at the end of the received character. An overrun error occurs if the CPU does not read the
character from the receiver register before the next one becomes available in the shift register.
Overrun error results in a loss of characters in the received data stream.
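The three error checks described above can be sketched in Python as follows. This is a simplified model, not the logic of any particular interface chip: it assumes seven data bits followed by an even-parity bit, and the class name `Receiver` and its attribute names are hypothetical.

```python
class Receiver:
    """Toy receiver portion: shift register to receiver register, plus status bits."""

    def __init__(self, num_data_bits=7, stop_bits=1):
        self.num_data_bits = num_data_bits
        self.stop_bits = stop_bits
        self.receiver_register = None
        self.full = False             # flag: receiver register is full
        self.parity_error = False
        self.framing_error = False
        self.overrun_error = False

    def accept_frame(self, frame):
        """frame = start bit, data bits, even-parity bit (assumed), stop bit(s)."""
        n = self.num_data_bits
        data, parity, stops = frame[1:1 + n], frame[1 + n], frame[2 + n:]
        if (sum(data) + parity) % 2 != 0:
            self.parity_error = True       # number of 1's has the wrong parity
        if stops != [1] * self.stop_bits:
            self.framing_error = True      # expected stop bits not detected
        if self.full:
            self.overrun_error = True      # CPU has not read the previous character
        self.receiver_register = data      # the previous character is overwritten
        self.full = True

    def cpu_read(self):
        """CPU reads the receiver register, which clears the full flag."""
        self.full = False
        return self.receiver_register


def make_frame(data, stops=(1,)):
    """Build a frame with a start bit, an even-parity bit and the given stop bits."""
    return [0] + data + [sum(data) % 2] + list(stops)


rx = Receiver()
rx.accept_frame(make_frame([1, 0, 0, 0, 0, 0, 1]))
print(rx.cpu_read(), rx.parity_error, rx.framing_error, rx.overrun_error)
```

In this sketch the error bits stay set until a new `Receiver` is created; a real interface would give the CPU a way to clear them.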
First-In, First-Out Buffer
MODES OF TRANSFER
Binary information received from an external device is usually stored in memory for later
processing. Information transferred from the central computer into an external device originates
in the memory unit. The CPU merely executes the I/O instructions and may accept the data
temporarily, but the ultimate source or destination is the memory unit.
Data transfer between the central computer and I/O devices may be handled in a variety of
modes. Some modes use the CPU as an intermediate path; others transfer the data directly to and
from the memory unit.
Data transfer to and from peripherals may be handled in one of three possible modes:
1. Programmed I/O
2. Interrupt-initiated I/O
3. Direct memory access (DMA)
Programmed I/O operations are the result of I/O instructions written in the computer program.
Each data item transfer is initiated by an instruction in the program. Usually, the transfer is to
and from a CPU register and peripheral. Other instructions are needed to transfer the data to and
from CPU and memory. Transferring data under program control requires constant monitoring
of the peripheral by the CPU. Once a data transfer is initiated, the CPU is required to monitor
the interface to see when a transfer can again be made. It is up to the programmed instructions
executed in the CPU to keep close tabs on everything that is taking place in the interface unit
and the I/O device.
In the programmed I/O method, the CPU stays in a program loop until the I/O
unit indicates that it is ready for data transfer. This is a time-consuming process since it keeps
the processor busy needlessly. It can be avoided by using an interrupt facility and special
commands to inform the interface to issue an interrupt request signal when the data are
available from the device. In the meantime the CPU can proceed to execute another program.
The interface meanwhile keeps monitoring the device. When the interface determines that the
device is ready for data transfer, it generates an interrupt request to the computer. Upon
detecting the external interrupt signal, the CPU momentarily stops the task it is processing,
branches to a service program to process the I/O transfer, and then returns to the task it was
originally performing. Transfer of data under programmed I/O is between CPU and peripheral.
In direct memory access (DMA), the interface transfers data into and out of the memory unit
through the memory bus. The CPU initiates the transfer by supplying the interface with the
starting address and the number of words needed to be transferred and then proceeds to execute
other tasks. When the transfer is made, the DMA requests memory cycles through the memory
bus. When the request is granted by the memory controller, the DMA transfers the data directly
into memory. The CPU merely delays its memory access operation to allow the direct memory
I/O transfer. Since peripheral speed is usually slower than processor speed, I/O-memory
transfers are infrequent compared to processor access to memory.
Many computers combine the interface logic with the requirements for direct memory access
into one unit and call it an I/O processor (IOP). The IOP can handle many peripherals through a
DMA and interrupt facility. In such a system, the computer is divided into three separate
modules: the memory unit, the CPU, and the IOP.
EXAMPLE OF PROGRAMMED I/O
In the programmed I/O method, the I/O device does not have direct access to memory. A
transfer from an I/O device to memory requires the execution of several instructions by the
CPU, including an input instruction to transfer the data from the device to the CPU, and a store
instruction to transfer the data from the CPU to memory. Other instructions may be needed to
verify that the data are available from the device and to count the number of words transferred.
An example of data transfer from an I/O device through an interface into the CPU is shown in
Fig.
The device transfers bytes of data one at a time as they are available. When a byte of data is
available, the device places it in the I/O bus and enables its data valid line. The interface accepts the
byte into its data register and enables the data accepted line. The interface sets a bit in the
status register that we will refer to as an F or “flag” bit. The device can now disable the data
valid line, but it will not transfer another byte until the data accepted line is disabled by the
interface.
A program is written for the computer to check the flag in the status register to determine if a
byte has been placed in the data register by the I/O device. This is done by reading the status
register into a CPU register and checking the value of the flag bit. If the flag is equal to 1, the
CPU reads the data from the data register. The flag bit is then cleared to 0 by either the CPU or
the interface, depending on how the interface circuits are designed. Once the flag is cleared, the
interface disables the data accepted line and the device can then transfer the next data byte.
It is assumed that the device is sending a sequence of bytes that must be stored in memory. The
transfer of each byte requires three instructions:
1. Read the status register.
2. Check the status of the flag bit and branch to step 1 if not set or to step 3 if set.
3. Read the data register.
Each byte is read into a CPU register and then transferred to memory with a store instruction. A
common I/O programming task is to transfer a block of words from an I/O device and store them
in a memory buffer.
The programmed I/O method is particularly useful in small low-speed computers or in systems that
are dedicated to monitor a device continuously. The difference in information transfer rate
between the CPU and the I/O device makes this type of transfer inefficient.
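The three-instruction polling loop can be sketched in Python. The `Interface` class below is a made-up stand-in for the hardware: it assumes the flag is cleared by reading the data register, and it models the slow device by setting the flag only after a fixed number of status reads.

```python
class Interface:
    """Toy interface with a data register and a status register holding flag F."""

    def __init__(self, incoming, delay=2):
        self.incoming = list(incoming)   # bytes the device will deliver
        self.delay = delay               # status reads needed before a byte is ready
        self.countdown = delay
        self.flag = 0
        self.data_register = None

    def read_status(self):
        # Model the slow device: the flag is set only after some polls.
        if self.flag == 0 and self.incoming:
            self.countdown -= 1
            if self.countdown <= 0:
                self.data_register = self.incoming.pop(0)
                self.flag = 1
                self.countdown = self.delay
        return self.flag

    def read_data(self):
        self.flag = 0                    # assumption: reading the data clears the flag
        return self.data_register


def transfer_block(interface, count):
    """Programmed I/O: poll the flag, read each byte, store it in a memory buffer.

    count must not exceed the number of bytes the device delivers,
    otherwise the polling loop never ends.
    """
    memory_buffer, polls = [], 0
    while len(memory_buffer) < count:
        polls += 1
        if interface.read_status() == 0:             # steps 1 and 2: read status, test flag
            continue                                 # flag not set: branch back to step 1
        memory_buffer.append(interface.read_data())  # step 3, then the store
    return memory_buffer, polls


buffer, polls = transfer_block(Interface([0x41, 0x42, 0x43], delay=3), 3)
print(buffer, polls)
```

The `polls` counter makes the inefficiency visible: with `delay=3` the CPU reads the status register three times for every byte it actually transfers.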
INTERRUPT-INITIATED I/O
An alternative to the CPU constantly monitoring the flag is to let the interface inform the
computer when it is ready to transfer data. This mode of transfer uses the interrupt facility.
While the CPU is running a program, it does not check the flag. However, when the flag is set,
the computer is momentarily interrupted from proceeding with the current program and is
informed of the fact that the flag has been set.
The CPU deviates from what it is doing to take care of the input or output transfer. After the
transfer is completed, the computer returns to the previous program to continue what it was
doing before the interrupt.
The CPU responds to the interrupt signal by storing the return address from the
program counter into a memory stack and then control branches to a service routine that processes the
required I/O transfer. The way that the processor chooses the branch address of the service
routine varies from one unit to another. In principle, there are two methods for accomplishing
this. One is called vectored interrupt and the other, nonvectored interrupt. In a nonvectored
interrupt, the branch address is assigned to a fixed location in memory. In a vectored interrupt,
the source that interrupts supplies the branch information to the computer. This information is
called the interrupt vector. In some computers the interrupt vector is the first address of the I/O
service routine. In other computers the interrupt vector is an address that points to a location in
memory where the beginning address of the I/O service routine is stored.
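The two ways of choosing the branch address can be compared in a short Python sketch. The fixed address `0x0040`, the function names, and the use of a dictionary as memory are all assumptions made for illustration only.

```python
FIXED_SERVICE_ADDRESS = 0x0040   # hypothetical fixed location for the nonvectored case


def branch_address_nonvectored():
    """Nonvectored interrupt: the branch address is a fixed location."""
    return FIXED_SERVICE_ADDRESS


def branch_address_vectored(interrupt_vector, memory=None):
    """Vectored interrupt: the interrupting source supplies the vector.

    If memory is None, the vector itself is the first address of the routine.
    Otherwise the vector points to a location that holds that address.
    """
    if memory is None:
        return interrupt_vector
    return memory[interrupt_vector]


def respond_to_interrupt(pc, stack, branch_address):
    """Store the return address on the stack and return the new PC value."""
    stack.append(pc)
    return branch_address


stack = []
table = {0x08: 0x2000}           # location 0x08 holds the routine's start address
new_pc = respond_to_interrupt(0x1234, stack, branch_address_vectored(0x08, table))
print(hex(new_pc), [hex(a) for a in stack])
```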
SOFTWARE CONSIDERATIONS
The previous discussion was concerned with the basic hardware needed to interface I/O devices
to a computer system. A computer must also have software routines for controlling peripherals
and for transfer of data between the processor and peripherals. I/O routines must issue control
commands to activate the peripheral and to check the device status to determine when
it is ready for data transfer. Once ready, information is transferred item by item until all the data
are transferred. In some cases, a control command is then given to execute a device function
such as stop tape or print characters. Error checking and other useful steps often accompany
the transfers.
In interrupt-controlled transfers, the I/O software must issue commands to the peripheral to
interrupt when ready and to service the interrupt when it occurs. In DMA transfer, the I/O
software must initiate the DMA channel to start its operation.
Software control of input-output equipment is a complex undertaking. For this reason I/O
routines for standard peripherals are provided by the manufacturer as part of the computer
system. They are usually included within the operating system. Most operating systems are
supplied with a variety of I/O programs to support the particular line of peripherals offered for
the computer. I/O routines are usually available as operating system procedures and the user
refers to the established routines to specify the type of transfer required without going into
detailed machine language programs.
PRIORITY INTERRUPT
Data transfer between the CPU and an I/O device is initiated by the CPU. The CPU
cannot start the transfer unless the device is ready to communicate with the CPU. The readiness
of the device can be determined from an interrupt signal.
When a number of I/O devices are attached to the computer, several sources may request service
simultaneously. The first task of the interrupt system is to identify the source of the
interrupt and decide which device to service first.
A priority interrupt is a system that determines which interrupt is to be served first when two or
more requests are made simultaneously. It also determines which interrupts are permitted to
interrupt the computer while another is being serviced. Higher-priority interrupts can make
requests while a lower-priority interrupt is being serviced.
Establishing the priority of simultaneous interrupts can be done by software or hardware.
Priority Interrupt by Software (Polling)
- Priority is established by the order of polling the devices (interrupt sources)
- Flexible, since it is established by software
- Low cost, since it needs very little hardware
- Very slow
Priority Interrupt by Hardware
- Requires a priority interrupt manager which accepts all the interrupt requests to determine the
highest priority request
- Fast, since the highest priority interrupt request is identified by the hardware
- Each interrupt source has its own interrupt vector to access its own service routine directly
The hardware priority function can be established by either a serial or a parallel connection of
interrupt lines. The serial connection is also known as the daisy chaining method.
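The software (polling) method listed above can be sketched in a few lines of Python. The device names and the function `poll_for_source` are hypothetical; the point is only that the polling order itself is the priority order.

```python
def poll_for_source(request_flags, polling_order):
    """Return the first device in polling_order whose request flag is set."""
    for device in polling_order:
        if request_flags.get(device, False):
            return device
    return None        # no device is requesting service


flags = {"disk": False, "printer": True, "keyboard": True}
# The disk is polled first, so it has the highest priority in this ordering.
print(poll_for_source(flags, ["disk", "printer", "keyboard"]))
```

Changing the priority only requires changing the order of the list, which is the flexibility noted above; the cost is that every device ahead of the requester must be tested first.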
DAISY-CHAINING PRIORITY
The daisy-chaining method of establishing priority consists of a serial connection of all devices
that request an interrupt. The device with the highest priority is placed in the first position,
followed by lower-priority devices up to the device with the lowest priority, which is placed last in
the chain.
The interrupt request line is common to all devices and forms a wired logic connection. If any
device has its interrupt signal in the low-level state, the interrupt line goes to the low-level state
and enables the interrupt input in the CPU. When no interrupts are pending, the interrupt line
stays in the high-level state and no interrupts are recognized by the CPU.
The CPU responds to an interrupt request by enabling the interrupt acknowledge line. This
signal is received by device 1 at its PI (priority in) input. The acknowledge signal passes on to
the next device through the PO (priority out) output only if device 1 is not requesting an
interrupt.
If device 1 has a pending interrupt, it blocks the acknowledge signal from the next device by
placing a 0 in the PO output. It then proceeds to insert its own interrupt vector address (VAD)
into the data bus for the CPU to use during the interrupt cycle.
The device with PI = 1 and PO = 0 is the one with the highest priority that is requesting an
interrupt, and this device places its VAD on the data bus.
The following figure shows the internal logic that must be included within each device when
connected in the daisy-chaining scheme.
Fig: One stage of the daisy-chain priority arrangement
The device sets its RF flip-flop when it wants to interrupt the CPU. The output of the RF flip-flop
goes through an open-collector inverter, a circuit that provides the wired logic for the
common interrupt line.
If PI = 0, both PO and the enable line to VAD are equal to 0, irrespective of the value of RF.
If PI = 1 and RF = 0, then PO = 1 and the vector address is disabled. This condition passes the
acknowledge signal to the next device through PO.
The device is active when PI = 1 and RF = 1. This condition places a 0 in PO and enables the
vector address for the data bus. It is assumed that each device has its own distinct vector
address.
The RF flip-flop is reset after a sufficient delay to ensure that the CPU has received the vector
address.
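The PI, PO and RF conditions above amount to a small truth table, which can be written out in Python. This is a sketch of the logic only; the function names and the example vector addresses are invented for illustration.

```python
def daisy_chain_stage(pi, rf):
    """One stage of the chain: return (PO, vector address enable) from PI and RF."""
    enable = pi & rf          # device is active only when PI = 1 and RF = 1
    po = pi & (1 - rf)        # acknowledge is passed on only when PI = 1 and RF = 0
    return po, enable


def acknowledge(request_flags, vector_addresses):
    """Propagate the acknowledge signal down the chain, starting with PI = 1 at device 1."""
    pi = 1
    for rf, vad in zip(request_flags, vector_addresses):
        po, enable = daisy_chain_stage(pi, rf)
        if enable:
            return vad        # this device places its VAD on the data bus
        pi = po               # PO of this device is PI of the next one
    return None


# Devices 2 and 3 both request service; device 2 is closer to the CPU and wins.
print(hex(acknowledge([0, 1, 1], [0x10, 0x14, 0x18])))
```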
PARALLEL PRIORITY INTERRUPT
The parallel priority interrupt method uses a register whose bits are set separately by the
interrupt signal from each device.
Priority is established according to the position of the bits in the register. In addition to the
interrupt register the circuit may include a mask register whose purpose is to control the status
of each interrupt request.
The mask register can be programmed to disable lower-priority interrupts while a higher-priority
device is being serviced. It can also provide a facility that allows a high-priority device
to interrupt the CPU while a lower-priority device is being serviced.
The priority logic for a system of four interrupt sources is shown in Fig.
It consists of an interrupt register whose individual bits are set by external conditions and
cleared by program instructions.
The mask register has the same number of bits as the interrupt register. By means of program
instructions, it is possible to set or reset any bit in the mask register.
Each interrupt bit and its corresponding mask bit are applied to an AND gate to produce
the four inputs to a priority encoder. In this way an interrupt is recognized only if its
corresponding mask bit is set to 1 by the program.
The priority encoder generates two bits of the vector address, which is transferred to the CPU.
Another output from the encoder sets an interrupt status flip-flop IST when an interrupt that is
not masked occurs.
The interrupt enable flip-flop IEN can be set or cleared by the program to provide an overall control
over the interrupt system.
The outputs of IST ANDed with IEN provide a common interrupt signal for the CPU.
The interrupt acknowledge INTACK signal from the CPU enables the bus buffers in the output
register and a vector address VAD is placed into the data bus.
Priority Encoder
The priority encoder is a circuit that implements the priority function. The logic of the priority
encoder is such that if two or more inputs arrive at the same time, the input having the highest
priority will take precedence.
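A four-input priority encoder combined with the mask register and IEN can be sketched in Python as follows. The sketch assumes that input 0 has the highest priority and that the encoder output is simply the index of the winning input; the function names are hypothetical.

```python
def priority_encoder(inputs):
    """Return (index of the highest-priority active input, IST).

    inputs[0] is assumed to have the highest priority.
    """
    for index, bit in enumerate(inputs):
        if bit:
            return index, 1
    return 0, 0            # no active input: the index is a don't-care and IST = 0


def interrupt_logic(interrupt_register, mask_register, ien):
    """AND each interrupt bit with its mask bit, encode, and gate IST with IEN."""
    masked = [i & m for i, m in zip(interrupt_register, mask_register)]
    index, ist = priority_encoder(masked)
    return index, ist, ist & ien   # last value is the interrupt signal to the CPU


# Sources 1 and 2 request service, but source 1 is masked off, so source 2 is encoded.
print(interrupt_logic([0, 1, 1, 0], [1, 0, 1, 1], ien=1))
```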
Interrupt Cycle
The interrupt enable flip-flop IEN can be set or cleared by program instructions.
When IEN is cleared, the interrupt request coming from IST is neglected by the CPU.
The program-controlled IEN bit allows the programmer to choose whether to use the interrupt
facility. If an instruction to clear IEN has been inserted in the program, it means that the user
does not want his program to be interrupted. An instruction to set IEN indicates that the
interrupt facility will be used while the current program is running.
Most computers include internal hardware that clears IEN to 0 every time an interrupt is
acknowledged by the processor.
At the end of each instruction cycle the CPU checks IEN and the interrupt signal from IST. If
either is equal to 0, control continues with the next instruction.
If both IEN and IST are equal to 1, the CPU goes to an interrupt cycle.
During the interrupt cycle the CPU performs the following sequence of microoperations:
SP ← SP - 1      Decrement stack pointer
M[SP] ← PC       Push PC into stack
INTACK ← 1       Enable interrupt acknowledge
PC ← VAD         Transfer vector address to PC
IEN ← 0          Disable further interrupts
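The interrupt cycle microoperations can be traced with a small Python sketch. The CPU state is held in a dictionary and memory in another one; these representations and the function names are assumptions made for illustration, not part of the original design.

```python
def interrupt_cycle(cpu, memory, vad):
    """Perform the interrupt cycle microoperations in order."""
    cpu["SP"] -= 1                  # SP <- SP - 1
    memory[cpu["SP"]] = cpu["PC"]   # M[SP] <- PC
    cpu["INTACK"] = 1               # INTACK <- 1
    cpu["PC"] = vad                 # PC <- VAD
    cpu["IEN"] = 0                  # IEN <- 0


def end_of_instruction(cpu, memory, ist, vad):
    """Enter the interrupt cycle only when both IEN and IST are equal to 1."""
    if cpu["IEN"] == 1 and ist == 1:
        interrupt_cycle(cpu, memory, vad)
        return True
    return False                    # control continues with the next instruction


cpu = {"PC": 0x100, "SP": 0x200, "IEN": 1, "INTACK": 0}
memory = {}
end_of_instruction(cpu, memory, ist=1, vad=0x40)
print(cpu, memory)
```

Because IEN is cleared inside the cycle, a second request is ignored until the service routine sets IEN again.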
Software Routines
A priority interrupt system is a combination of hardware and software techniques.
The following figure shows the programs that must reside in memory for handling the
interrupt system.
Initial and Final Operations
Each interrupt service routine must have an initial and final set of operations for controlling the
registers in the hardware interrupt system.
Initial Sequence
[1] Clear lower level Mask reg. bits
[2] IST ← 0
[3] Save contents of CPU registers
[4] IEN ← 1
[5] Go to Interrupt Service Routine
Final Sequence
[1] IEN ← 0
[2] Restore CPU registers
[3] Clear the bit in the Interrupt Reg
[4] Set lower level Mask reg. bits
[5] Restore return address into PC, and IEN ← 1
The initial and final operations are referred to as overhead operations or
housekeeping chores. They are not part of the service program proper but are essential for
processing interrupts.
All overhead operations can be implemented by software. This is done by inserting the proper
instructions at the beginning and at the end of each service routine. Some of the overhead
operations can be done automatically by the hardware.
DIRECT MEMORY ACCESS (DMA):
The transfer of data between a fast storage device such as magnetic disk and memory is often
limited by the speed of the CPU. Removing the CPU from the path and letting the peripheral
device manage the memory buses directly would improve the speed of transfer. This transfer
technique is called direct memory access (DMA).
During DMA transfer, the CPU is idle and has no control of the memory buses.
A DMA controller takes over the buses to manage the transfer directly between the I/O device
and memory.
The CPU may be placed in an idle state in a variety of ways. One common method extensively
used in microprocessors is to disable the buses through special control signals.
The bus request (BR) input is used by the DMA controller to request the CPU to relinquish control
of the buses. When this input is active, the CPU terminates the execution of the current
instruction and places the address bus, the data bus, and the read and write lines into a
high-impedance state. The high-impedance state behaves like an open circuit, which means that
the output is disconnected and does not have a logic significance.
The CPU activates the bus grant (BG) output to inform the external DMA that the buses are in
the high-impedance state. The DMA that originated the bus request can now take control of the
buses to conduct memory transfers without processor intervention. When the DMA terminates
the transfer, it disables the bus request line. The CPU disables the bus grant, takes control of the
buses, and returns to its normal operation.
When the DMA takes control of the bus system, it communicates directly with the memory.
The transfer can be made in several ways. In DMA burst transfer, a block sequence consisting of
a number of memory words is transferred in a continuous burst while the DMA controller is
master of the memory buses. This mode of transfer is needed for fast devices such as magnetic
disks, where data transmission cannot be stopped or slowed down until an entire block is
transferred.
An alternative technique called cycle stealing allows the DMA controller to transfer one data
word at a time after which it must return control of the buses to the CPU. The CPU merely
delays its operation for one memory cycle to allow the direct memory I/O transfer to “steal” one
memory cycle.
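The difference between burst transfer and cycle stealing can be pictured as a sequence of bus masters, one per memory cycle. The Python sketch below is a deliberately simplified model: it assumes the device always has a word ready when the DMA gets the bus, and the function name `bus_schedule` is made up for this illustration.

```python
def bus_schedule(num_words, cpu_cycles_wanted, mode):
    """Return the bus master for each memory cycle.

    mode is "burst" or "cycle_stealing".
    """
    schedule = []
    words_left, cpu_left = num_words, cpu_cycles_wanted
    if mode == "burst":
        schedule += ["DMA"] * words_left    # whole block in one continuous burst
        schedule += ["CPU"] * cpu_left      # CPU resumes after the block
        return schedule
    while words_left or cpu_left:
        if words_left:
            schedule.append("DMA")          # steal one memory cycle for one word
            words_left -= 1
        if cpu_left:
            schedule.append("CPU")          # control of the buses returns to the CPU
            cpu_left -= 1
    return schedule


print(bus_schedule(3, 4, "burst"))
print(bus_schedule(3, 4, "cycle_stealing"))
```

Both modes use the same total number of memory cycles here; what changes is how long the CPU is kept off the buses at a stretch.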
DMA CONTROLLER
The following figure shows the block diagram of a typical DMA controller.
The unit communicates with the CPU via the data bus and control lines. The registers in the
DMA are selected by the CPU through the address bus by enabling the DS (DMA select) and
RS (register select) inputs. The RD (read) and WR (write) inputs are bidirectional.
When the BG (bus grant) input is 0, the CPU can communicate with the DMA registers through
the data bus to read from or write to the DMA registers.
When BG = 1, the CPU has relinquished the buses and the DMA can communicate
directly with the memory by specifying an address in the address bus and activating the RD or
WR control.
The DMA communicates with the external peripheral through the request and acknowledge
lines by using a prescribed handshaking procedure.
The DMA controller has three registers: an address register, a word count register, and a control
register. The address register contains an address to specify the desired location in memory. The
address bits go through bus buffers into the address bus. The address register is incremented
after each word that is transferred to memory.
The word count register holds the number of words to be transferred. This register is decremented
by one after each word transfer and internally tested for zero.
The control register specifies the mode of transfer. All registers in the DMA appear to the CPU
as I/O interface registers. Thus the CPU can read from or write into the DMA registers under
program control via the data bus.
The DMA is first initialized by the CPU. After that, the DMA starts and continues to transfer
data between memory and peripheral unit until an entire block is transferred. The initialization
process is essentially a program consisting of I/O instructions that include the address for
selecting particular DMA registers. The CPU initializes the DMA by sending the following information
through the data bus:
1. The starting address of the memory block where data are available (for read) or where data are to
be stored (for write)
2. The word count, which is the number of words in the memory block
3. Control to specify the mode of transfer such as read or write
4. A control to start the DMA transfer
The starting address is stored in the address register. The word count is stored in the word count
register, and the control information in the control register.
Once the DMA is initialized, the CPU stops communicating with the DMA unless it receives an
interrupt signal or if it wants to check how many words have been transferred.
DMA Transfer
The position of the DMA controller among the other components in a computer system is illustrated in
the following fig.
The CPU communicates with the DMA through the address and data buses as with any
interface unit. The DMA has its own address, which activates the DS and RS lines.
The CPU initializes the DMA through the data bus. Once the DMA receives the start control
command, it can start the transfer between the peripheral device and the memory.
When the peripheral device sends a DMA request, the DMA controller activates the BR line,
informing the CPU to relinquish the buses. The CPU responds with its BG line, informing the
DMA that its buses are disabled.
The DMA then puts the current value of its address register into the address bus, initiates the
RD or WR signal, and sends a DMA acknowledge to the peripheral device.
Note that the RD and WR lines in the DMA controller are bidirectional. The direction of
transfer depends on the status of the BG line. When BG = 0, the RD and WR are input lines
allowing the CPU to communicate with the internal DMA registers. When BG = 1, the RD and
WR are output lines from the DMA controller to the random-access memory to specify
the read or write operation for the data.
When the peripheral device receives a DMA acknowledge, it puts a word in the data bus (for write) or
receives a word from the data bus (for read). Thus the DMA controls the read or write
operations and supplies the address for the memory.
The peripheral unit can then communicate with memory through the data bus for direct transfer
between the two units while the CPU is momentarily disabled.
For each word that is transferred, the DMA increments its address register and decrements its
word count register. If the word count does not reach zero, the DMA checks the request line
coming from the peripheral.
For a high-speed device, the line will be active as soon as the previous transfer is completed.
A second transfer is then initiated, and the process continues until the entire block is
transferred.
If the peripheral speed is slower, the DMA request line may come somewhat later. In this case
the DMA disables the bus request line so that the CPU can continue to execute its program.
When the peripheral requests a transfer, the DMA requests the buses again.
If the word count register reaches zero, the DMA stops any further transfer and removes its bus
request. It also informs the CPU of the termination by means of an interrupt.
When the CPU responds to the interrupt, it reads the content of the word count register. The
zero value of this register indicates that all the words were transferred successfully. The CPU
can read this register at any time to check the number of words already transferred.
A DMA controller may have more than one channel. In this case, each channel has a
request and acknowledge pair of control signals which are connected to separate peripheral
devices. Each channel also has its own address register and word count register within the
DMA controller.
A priority among the channels may be established so that channels with high priority are
serviced before channels with lower priority.
DMA transfer is very useful in many applications.
It is used for fast transfer of information between magnetic disks and memory.
It is also useful for updating the display in an interactive terminal.
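The register behaviour during a DMA transfer (address register incremented, word count decremented and tested for zero, interrupt on completion) can be sketched in Python. This is a single-channel toy model: memory is a dictionary, the peripheral is a list of words, and the class and attribute names are assumptions made for illustration. The BR/BG handshake with the CPU is reduced to a single `bus_request` attribute.

```python
class DMAController:
    """Toy single-channel DMA controller with address, word count and control state."""

    def __init__(self):
        self.address_register = 0
        self.word_count = 0
        self.write_to_memory = True   # control register: direction of transfer
        self.bus_request = 0          # BR line
        self.interrupt = 0            # interrupt to the CPU at termination

    def initialize(self, start_address, word_count, write_to_memory=True):
        """The CPU loads the registers through the data bus before the transfer starts."""
        self.address_register = start_address
        self.word_count = word_count
        self.write_to_memory = write_to_memory

    def transfer_word(self, memory, peripheral):
        """Serve one DMA request from the peripheral, assuming the bus was granted."""
        self.bus_request = 1
        if self.write_to_memory:
            memory[self.address_register] = peripheral.pop(0)
        else:
            peripheral.append(memory[self.address_register])
        self.address_register += 1    # next memory location
        self.word_count -= 1          # one word fewer to transfer
        if self.word_count == 0:
            self.bus_request = 0      # remove the bus request
            self.interrupt = 1        # inform the CPU of the termination


memory, device = {}, [7, 8, 9]
dma = DMAController()
dma.initialize(start_address=0x100, word_count=3)
while dma.word_count:
    dma.transfer_word(memory, device)
print(memory, hex(dma.address_register), dma.interrupt)
```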
COMPUTER ORGANIZATION AND ARCHITECTURE
UNIT-V
PIPELINE AND MULTIPROCESSORS
Pipeline
Parallel Processing
Pipelining
Arithmetic Pipeline
Instruction Pipeline
Multiprocessors
Characteristics of Multiprocessors
Interconnection Structures
Inter Processor Arbitration
Inter Processor Communication and Synchronization
PARALLEL PROCESSING
A parallel processing system is able to perform concurrent data processing to achieve faster
execution time
The system may have two or more ALUs and be able to execute two or more instructions at the same time
Also, the system may have two or more processors operating concurrently
The goal is to increase the throughput, the amount of processing that can be accomplished during a given
interval of time
Parallel processing increases the amount of hardware required
Example: the ALU can be separated into three units and the operands diverted to each unit under the
supervision of a control unit
All units are independent of each other
A multifunctional organization is usually associated with a complex control unit to coordinate
all the activities among the various components
Parallel processing can be classified from:
- The internal organization of the processors
- The interconnection structure between processors
- The flow of information through the system
- The number of instructions and data items that are manipulated simultaneously
The sequence of instructions read from memory is the instruction stream
The operations performed on the data in the processor constitute the data stream
Parallel processing may occur in the instruction stream, the data stream, or both
Flynn's computer classification:
- Single instruction stream, single data stream (SISD)
- Single instruction stream, multiple data stream (SIMD)
- Multiple instruction stream, single data stream (MISD)
- Multiple instruction stream, multiple data stream (MIMD)
SISD: Instructions are executed sequentially. Parallel processing may be achieved by means of
multiple functional units or by pipeline processing
SIMD: Includes multiple processing units with a single control unit. All processors receive the same
instruction, but operate on different data.
MIMD: A computer system capable of processing several programs at the same time.
PIPELINING
Pipelining is a technique of decomposing a sequential process into suboperations, with each
subprocess being executed in a special dedicated segment that operates concurrently with all
other segments
Each segment performs partial processing dictated by the way the task is partitioned
The result obtained from the computation in each segment is transferred to the next segment in the
pipeline
The final result is obtained after the data have passed through all segments
Each segment consists of an input register followed by a combinational circuit
A clock is applied to all registers after enough time has elapsed to perform all segment activity
The information flows through the pipeline one step at a time
Example: Ai * Bi + Ci for i = 1, 2, 3, …, 7
The suboperations performed in each segment are:
R1 ← Ai, R2 ← Bi
R3 ← R1 * R2, R4 ← Ci
R5 ← R3 + R4
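The three-segment example can be simulated clock by clock in Python. This is a sketch under simple assumptions: the operands are given as Python lists, an empty register is represented by `None`, and all registers are updated once per loop iteration to imitate a common clock.

```python
def run_pipeline(A, B, C):
    """Clock-by-clock model of the three-segment pipeline computing Ai * Bi + Ci."""
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None
    results = []
    for clock in range(n + 2):          # k + n - 1 clock pulses with k = 3 segments
        # All registers load on the same clock pulse, so the new values are
        # computed from the old ones, starting with the last segment.
        R5 = R3 + R4 if R3 is not None else None        # segment 3: R5 <- R3 + R4
        if R1 is not None:
            R3, R4 = R1 * R2, C[clock - 1]              # segment 2: R3 <- R1 * R2, R4 <- Ci
        else:
            R3, R4 = None, None
        if clock < n:
            R1, R2 = A[clock], B[clock]                 # segment 1: R1 <- Ai, R2 <- Bi
        else:
            R1, R2 = None, None                         # no more operands to load
        if R5 is not None:
            results.append(R5)
    return results


A = [1, 2, 3, 4, 5, 6, 7]
B = [2, 2, 2, 2, 2, 2, 2]
C = [10, 10, 10, 10, 10, 10, 10]
print(run_pipeline(A, B, C))
```

The first result appears after three clock pulses; after that one result is produced on every pulse, which is where the speedup of a pipeline comes from.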
As the number of tasks n increases, k + n - 1 approaches n and the speedup
S = n tn / ((k + n - 1) tp) becomes S = tn / tp
If the time to process a task is the same in the pipeline and non-pipeline circuits, tn = k tp and
S = k tp / tp = k
Therefore, the theoretical maximum speedup that a pipeline can provide is k
Example:
Cycle time = tp = 20 ns, number of segments = k = 4, number of tasks = n = 100
The pipeline system will take (k + n - 1) tp = (4 + 100 - 1) 20 ns = 2060 ns
Assuming that tn = k tp = 4 * 20 = 80 ns, a non-pipeline system requires n k tp = 100 * 80 = 8000 ns
The speedup ratio = 8000 / 2060 = 3.88
The pipeline cannot operate at its maximum theoretical rate
One reason is that the clock cycle must be chosen to equal the time delay of the segment with the
maximum propagation time
Pipeline organization is applicable for arithmetic operations and fetching instructions
ARITHMETIC PIPELINE
Pipeline arithmetic units are usually found in very high speed computers
They are used to implement floating-point operations, multiplication of fixed-point numbers, and
similar computations encountered in scientific problems
Example for floating-point addition and subtraction
Inputs are two normalized floating-point binary numbers
X = A x 2^a
Y = B x 2^b
A and B are two fractions that represent the mantissas
a and b are the exponents
Four segments are used to perform the following:
Compare the exponents
Align the mantissas
Add or subtract the mantissas
Normalize the result
For simplicity, consider the decimal numbers X = 0.9504 x 10^3 and Y = 0.8200 x 10^2
The two exponents are subtracted in the first
segment to obtain 3 – 2 = 1
The larger exponent 3 is chosen as the
exponent of the result
Segment 2 shifts the mantissa of Y to the right to
obtain Y = 0.0820 x 10^3
The mantissas are now aligned
Segment 3 produces the sum Z = 1.0324 x 10^3
Segment 4 normalizes the result by shifting the mantissa once to the right and incrementing
the exponent by one to obtain Z = 0.10324 x 10^4
The comparator, shifter, adder-subtractor, incrementer, and decrementer in the floating-point
pipeline are implemented with combinational circuits.
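The four segments can be followed step by step in code. This is a sketch under stated assumptions: it works on decimal mantissas like the worked example (a real unit operates on binary), it runs the segments one after another rather than concurrently, and the name fp_add is hypothetical.

```python
from decimal import Decimal


def fp_add(ma, ea, mb, eb):
    """Add two numbers given as (mantissa, exponent) pairs, base 10."""
    # Segment 1: compare the exponents and choose the larger one.
    diff = ea - eb
    exp = max(ea, eb)
    # Segment 2: align the mantissas by shifting the one with the
    # smaller exponent to the right.
    if diff > 0:
        mb = mb.scaleb(-diff)
    elif diff < 0:
        ma = ma.scaleb(diff)
    # Segment 3: add the mantissas.
    mz = ma + mb
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(mz) >= 1:
        mz = mz.scaleb(-1)  # shift right, increment exponent
        exp += 1
    while mz != 0 and abs(mz) < Decimal("0.1"):
        mz = mz.scaleb(1)   # shift left, decrement exponent
        exp -= 1
    return mz, exp


if __name__ == "__main__":
    print(fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2))
```

Decimal is used instead of float so that the shifted mantissas stay exact, as they would in the register-level example.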
Suppose that the time delays of the four segments are t1 = 60 ns, t2 = 70 ns, t3 = 100 ns, t4 = 80 ns, and the
interface registers have a delay of tr = 10 ns. The clock cycle is chosen to be tp = t3 + tr = 110 ns.
An equivalent non-pipeline floating-point adder-subtractor will have a delay time tn = t1 + t2 +
t3 + t4 + tr = 320 ns. In this case the pipelined adder has a speedup of 320/110 = 2.9 over the
non-pipelined adder.
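The same calculation, written so that other segment delays can be tried. A minimal sketch; the function name clock_and_speedup is an assumption for illustration.

```python
def clock_and_speedup(segment_delays, tr):
    """Return (tp, tn, speedup) for given segment delays and register delay."""
    tp = max(segment_delays) + tr  # clock cycle is set by the slowest segment
    tn = sum(segment_delays) + tr  # delay of the equivalent non-pipeline unit
    return tp, tn, tn / tp


if __name__ == "__main__":
    tp, tn, s = clock_and_speedup([60, 70, 100, 80], 10)
    print(tp, tn, round(s, 1))
```

The speedup stays below the segment count of 4 because the segments are unbalanced: three of them finish before the clock period ends.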
INSTRUCTION PIPELINE
An instruction pipeline reads consecutive instructions from memory while previous instructions are
being executed in other segments
This causes the instruction fetch and execute phases to overlap and perform simultaneous
operations
If a branch out of sequence occurs, the pipeline must be emptied and all the instructions that have
been read from memory after the branch instruction must be discarded
Consider a computer with an instruction fetch unit and an instruction execution unit forming a
two-segment pipeline
A FIFO buffer can be used for the fetch segment
Thus, an instruction stream can be placed in a queue, waiting for decoding and processing by
the execution segment
This reduces the average access time to memory for reading instructions
Whenever there is space in the buffer, the control unit initiates the next instruction fetch phase
The following steps are needed to process each instruction:
1. Fetch the instruction from memory
2. Decode the instruction
3. Calculate the effective address
4. Fetch the operands from memory
5. Execute the instruction
6. Store the result in the proper place
The pipeline may not perform at its maximum rate due to:
Different segments taking different times to operate
Some segments being skipped for certain operations
Memory access conflicts
Example: Four-segment instruction pipeline
Assume that the decoding can be combined with
calculating the EA in one segment
Assume that most of the instructions store the result in
a register so that the execution and storing of the
result can be combined in one segment
While an instruction is being executed in segment 4, the next instruction in sequence is busy
fetching an operand from memory in segment 3. The effective address may be calculated in a
separate arithmetic circuit for the third instruction, and whenever the memory is available, the
fourth and all subsequent instructions can be fetched and placed in an instruction FIFO
Up to four suboperations in the instruction cycle can overlap and up to four different
instructions can be in progress of being processed at the same time
The following figure shows the operation of the instruction pipeline. The four segments are
represented in the diagram with an abbreviated symbol.
FI: Fetch an instruction from memory
DA: Decode the instruction and calculate the effective address of the operand
FO: Fetch the operand
EX: Execute the operation
It is assumed that the processor has separate instruction and data memories
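The timing of the four segments can be tabulated with a short script. This is a sketch that assumes an ideal pipeline with no branches and no memory conflicts; the function name timing_table and the "--" filler for idle steps are assumptions for illustration.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def timing_table(num_instructions):
    """Return one row per instruction; each row lists the segment per step."""
    k = len(SEGMENTS)
    steps = num_instructions + k - 1  # steps needed to finish all instructions
    table = []
    for i in range(num_instructions):
        row = ["--"] * steps
        # Instruction i enters FI at step i and advances one segment per step.
        for s, name in enumerate(SEGMENTS):
            row[i + s] = name
        table.append(row)
    return table


if __name__ == "__main__":
    for i, row in enumerate(timing_table(4), start=1):
        print(f"I{i}: " + " ".join(row))
```

Reading any column of the printed table from top to bottom shows which instruction occupies which segment during that step.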
Assume now that instruction 3 is a branch instruction. As soon as this instruction is decoded in
segment DA in step 4, the transfer from FI to DA of the other instructions is halted until the
branch instruction is executed in step 6. If the branch is taken, a new instruction is fetched in
step 7. If the branch is not taken, the instruction fetched previously in step 4 can be used. The
pipeline then continues until a new branch instruction is encountered.
Another delay may occur in the pipeline if the EX segment needs to store the result of the
operation in the data memory while the FO segment needs to fetch an operand. In that case,
segment FO must wait until segment EX has finished its operation.
Reasons for the pipeline to deviate from its normal operation are:
Resource conflicts caused by access to memory by two segments at the same time. Most of these
conflicts can be resolved by using separate instruction and data memories.
Data dependency conflicts arise when an instruction depends on the result of a previous
instruction, but this result is not yet available
Branch difficulties arise from program control instructions that may change the value of PC
Methods to handle data dependency:
Hardware interlocks are circuits that detect instructions whose source operands are
destinations of prior instructions. Detection causes the hardware to insert the required delays
without altering the program sequence.
Operand forwarding uses special hardware to detect a conflict and then avoid it by routing the
data through special paths between pipeline segments. This requires additional hardware paths
through multiplexers as well as the circuit to detect the conflict.
Delayed load is a procedure that gives the responsibility for solving data conflicts to the
compiler. The compiler is designed to detect a data conflict and reorder the instructions as
necessary to delay the loading of the conflicting data by inserting no-operation instructions.
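The delayed-load idea can be illustrated with a toy compiler pass. This is only a sketch: it assumes a conflict exists exactly when an instruction uses the destination of the LOAD immediately before it, and the names Instr, NOP and insert_delay_slots are hypothetical.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str                # operation name, e.g. "LOAD" or "ADD"
    dest: str              # destination register or memory name
    srcs: Tuple[str, ...]  # source operands


NOP = Instr("NOP", "", ())


def insert_delay_slots(program):
    """Insert a no-operation after a LOAD whose result is used immediately."""
    out = []
    for instr in program:
        previous = out[-1] if out else None
        if previous is not None and previous.op == "LOAD" and previous.dest in instr.srcs:
            out.append(NOP)  # delay the dependent instruction by one step
        out.append(instr)
    return out


if __name__ == "__main__":
    program = [
        Instr("LOAD", "R1", ("A",)),
        Instr("LOAD", "R2", ("B",)),
        Instr("ADD", "R3", ("R1", "R2")),
        Instr("STORE", "C", ("R3",)),
    ]
    for instr in insert_delay_slots(program):
        print(instr.op, instr.dest, *instr.srcs)
```

A hardware interlock would detect the same condition at run time and stall the pipeline instead of relying on the compiler.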
Methods to handle branch instructions:
Prefetching the target instruction in addition to the next instruction allows either instruction
to be available.
A branch target buffer (BTB) is an associative memory included in the fetch segment of the
branch instruction that stores the target instruction for a previously executed branch. It also
stores the next few instructions after the branch target instruction. This way, the branch
instructions that have occurred previously are readily available in the pipeline without
interruption.
The loop buffer is a variation of the BTB. It is a small, very high speed register file maintained
by the instruction fetch segment of the pipeline. It stores all branches within a loop segment.
Branch prediction uses some additional logic to guess the outcome of a conditional branch
instruction before it is executed. The pipeline then begins prefetching instructions from the
predicted path.
Delayed branch is used in most RISC processors so that the compiler rearranges the
instructions to delay the branch.
CHARACTERISTICS OF MULTIPROCESSORS
A multiprocessor system is an interconnection of two or more CPUs with memory and input-
output equipment. The term "processor" in multiprocessor can mean either a central processing
unit (CPU) or an input-output processor (IOP).
As it is most commonly defined, a multiprocessor system implies the existence of multiple
CPUs. Multiprocessors are classified as multiple instruction stream, multiple data stream
(MIMD) systems.
There are some similarities between multiprocessor and multicomputer systems since both
support concurrent operations. A multicomputer network consists of several autonomous computers that may
or may not communicate with each other. A multiprocessor system is controlled by one
operating system that provides interaction between processors, and all the components of the
system cooperate in the solution of a problem.
Some large-scale computers include two or more CPUs in their overall system.
Because microprocessors take very little physical space and are very inexpensive, it is
feasible to interconnect a large number of microprocessors into one composite system.
Very-large-scale integrated circuit technology has reduced the cost of computer components
Multiprocessing improves the reliability of the system so that a failure or error in one part has a
limited effect on the rest of the system.
The benefit derived from a multiprocessor organization is an improved system performance.
The system derives its high performance in one of two ways.
1. Multiple independent jobs can be made to operate in parallel.
2. A single job can be partitioned into multiple parallel tasks.
Multiprocessors are classified by the way their memory is organized.
A multiprocessor system with common shared memory is classified as a shared memory or
tightly coupled multiprocessor. Most commercial tightly coupled multiprocessors provide a
cache memory with each CPU.
An alternative model of multiprocessor is the distributed-memory or loosely coupled system.
Each processor element in a loosely coupled system has its own private local memory.
Loosely coupled systems are most efficient when the interaction between tasks is minimal,
whereas tightly coupled systems can tolerate a higher degree of interaction between tasks.
INTERCONNECTION STRUCTURES
The components that form a multiprocessor system are CPUs, IOPs connected to input-output
devices, and a memory unit that may be partitioned into a number of separate modules.
The interconnection between the components can have different physical configurations,
depending on the number of transfer paths that are available between the processors and
memory in a shared memory system or among the processing elements in a loosely coupled
system.
There are several physical forms available for establishing an interconnection network. Some of
these schemes are:
1. Time-shared common bus
2. Multiport memory
3. Crossbar switch
4. Multistage switching network
5. Hypercube system
Time-Shared Common Bus
A common-bus multiprocessor system consists
of a number of processors connected through a
common path to a memory unit. A time-shared
common bus for five processors is shown in Fig.
Only one processor can communicate with the memory or another processor at any given time.
Transfer operations are conducted by the processor that is in control of the bus at the time.
A command is issued to inform the destination unit what operation is to be performed. The
receiving unit recognizes its address in the bus and responds to the control signals from the
sender, after which the transfer is initiated.
The transfer conflicts must be resolved by incorporating a bus controller that establishes
priorities among the requesting units.
A single common-bus system is restricted to one transfer at a time.
The processors in the system can be kept busy more often through the implementation of two or
more independent buses to permit multiple simultaneous bus transfers.
A more economical implementation
of a dual bus structure is depicted in
Fig.
Each local bus may be connected to a
CPU, an IOP, or any combination of
processors.
A system bus controller links each local bus to a common system bus.
The IO devices connected to the local IOP, as well as the local memory, are available to the
local processor.
If an IOP is connected directly to the system bus, the IO devices attached to it may be made
available to all processors. Only one processor can communicate with the shared memory and
other common resources through the system bus at any given time.
The other processors are kept busy communicating with their local memory and IO devices.
Part of the local memory may be designed as a cache memory attached to the CPU
Multiport Memory
A multiport memory system employs separate buses between each memory module and each
CPU. This is shown in Fig. for four CPUs and four memory modules (MMs).
Each processor bus is connected to each memory module.
A processor bus consists of the address, data, and control
lines required to communicate with memory.
The memory module is said to have four ports and each
port accommodates one of the buses. The module must
have internal control logic to determine which port will
have access to memory at any given time.
Memory access conflicts are resolved by assigning fixed priorities to each memory port. The
priority for memory access associated with each processor may be established by the physical
port position that its bus occupies in each module.
The advantage of the multiport memory organization is the high transfer rate that can be
achieved because of the multiple paths between processors and memory.
The disadvantage is that it requires expensive memory control logic and a large number of
cables and connectors.
Crossbar Switch
The crossbar switch organization consists of a number of
cross points that are placed at intersections between
processor buses and memory module paths.
The small square in each cross point is a switch that
determines the path from a processor to a memory
module.
Each switch point has control logic to set up the transfer
path between a processor and memory.
It examines the address that is placed in the bus to determine whether its particular module is
being addressed.
It also resolves multiple requests for access to the same memory module on a predetermined
priority basis.
The functional design of a crossbar switch connected to one memory module is shown in figure.
The circuit consists of multiplexers that select the
data, address, and control from one CPU for
communication with the memory module.
Priority levels are established by the arbitration
logic to select one CPU when two or more
CPUs attempt to access the same memory.
A crossbar switch organization supports
simultaneous transfers from all memory modules because there is a separate path associated with
each module.
Multistage Switching Network
The basic component of a multistage network is a two-
input, two-output interchange switch.
The switch has the capability of connecting input A to
either of the outputs. Terminal B of the switch behaves in a
similar fashion. The switch also has the capability to
arbitrate between conflicting requests.
Using the 2 x 2 switch as a building block, it is possible to
build a multistage network to control the communication
between a number of sources and destinations.
Consider the binary tree shown in Fig. The two processors
P1 and P2 are connected through switches to eight
memory modules marked in binary from 000 through 111.
The path from a source to a destination is determined from
the binary bits of the destination number. The first bit of
the destination number determines the switch output in the first level. The second bit specifies
the output of the switch in the second level, and the third bit specifies the output of the switch in the
third level.
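The bit-by-bit routing rule can be written out directly. A minimal sketch: the function name route is hypothetical, and the code simply reports the destination bit consumed at each level; which physical output a 0 or a 1 selects depends on how the switch is drawn in the figure.

```python
def route(destination, levels=3):
    """Return the switch output selected at each level, first level first."""
    if not 0 <= destination < 2 ** levels:
        raise ValueError("destination does not fit in the given number of levels")
    bits = format(destination, f"0{levels}b")  # e.g. 5 -> "101"
    return [int(bit) for bit in bits]


if __name__ == "__main__":
    for module in range(8):
        print(format(module, "03b"), route(module))
```

With three levels of switches, each of the eight memory modules has a distinct path, because each module has a distinct three-bit number.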
Many different topologies have been proposed for multistage switching networks to control
processor-memory communication in a tightly coupled multiprocessor system or to control the
communication between the processing elements in a loosely coupled system.
The processor whose arbiter has a PI = 1 and PO = 0 is the one that is given control of the
system bus
A processor may be in the middle of a bus operation when a higher priority processor requests
the bus. The lower-priority processor must complete its bus operation before it relinquishes
control of the bus.
When an arbiter receives control of the bus (because its PI = 1 and PO = 0) it examines the
busy line. If the line is inactive, it means that no other processor is using the bus. The arbiter
activates the busy line and its processor takes control of the bus. However, if the arbiter finds
the busy line active, it means that another processor is currently using the bus.
The arbiter keeps examining the busy line while the lower-priority processor that lost control of
the bus completes its operation.
When the bus busy line returns to its inactive state, the higher-priority arbiter enables the busy
line, and its corresponding processor can then conduct the required bus transfers.
Parallel Arbitration Logic
The parallel bus arbitration technique uses an
external priority encoder and a decoder as shown in
Fig. Each bus arbiter in the parallel scheme has a
bus request output line and a bus acknowledge input
line.
Each arbiter enables the request line when its
processor is requesting access to the system bus.
The processor takes control of the bus if its
acknowledge input line is enabled.
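The combined effect of the priority encoder and decoder can be modeled as a function from request lines to acknowledge lines. This is a sketch only: it assumes that arbiter 0 has the highest priority, and the function name parallel_arbitrate is hypothetical.

```python
def parallel_arbitrate(requests):
    """Map request lines to acknowledge lines; at most one acknowledge is set."""
    acknowledges = [False] * len(requests)
    # Priority encoder: find the highest-priority active request
    # (assumption: lower index means higher priority).
    for index, requesting in enumerate(requests):
        if requesting:
            # Decoder: enable only the acknowledge line of the winner.
            acknowledges[index] = True
            break
    return acknowledges


if __name__ == "__main__":
    print(parallel_arbitrate([False, True, True, False]))
```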
Dynamic Arbitration Algorithms
A dynamic priority algorithm gives the system the capability for changing the priority of the
devices while the system is in operation.
The time slice algorithm allocates a fixed-length time slice of bus time that is offered
sequentially to each processor, in round-robin fashion. The service given to each system
component with this scheme is independent of its location along the bus.
In a bus system that uses polling, the bus grant signal is replaced by a set of lines called poll
lines which are connected to all units. These lines are used by the bus controller to define an
address for each device connected to the bus.
When a processor that requires access recognizes its address, it activates the bus busy line and
then accesses the bus. After a number of bus cycles, the polling process continues by choosing a
different processor. The polling sequence is normally programmable, and as a result, the
selection priority can be altered under program control.
The least recently used (LRU) algorithm gives the highest priority to the requesting device
that has not used the bus for the longest interval. The priorities are adjusted after a number of
bus cycles according to the LRU algorithm.
In the first-come, first-serve scheme, requests are served in the order received. To implement
this algorithm, the bus controller establishes a queue arranged according to the time that the bus
requests arrive. Each processor must wait for its turn to use the bus on a first-in, first-out (FIFO)
basis.
The rotating daisy-chain procedure is a dynamic extension of the daisy chain algorithm. In this
scheme there is no central bus controller, and the priority line is connected from the priority-out
of the last device back to the priority-in of the first device in a closed loop.
Each arbiter priority for a given bus cycle is determined by its position along the bus priority
line from the arbiter whose processor is currently controlling the bus. Once an arbiter releases
the bus, it has the lowest priority.
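Three of the dynamic schemes above reduce to short selection rules. The sketch below is illustrative only: the function names, the numbering of processors from 0, and the use of a "last used" time per device are assumptions, not part of any particular bus standard.

```python
def time_slice_owner(cycle, num_processors, slice_length):
    """Time slice: the bus is offered to each processor in round-robin order."""
    return (cycle // slice_length) % num_processors


def lru_grant(requests, last_used):
    """LRU: grant the requester that has not used the bus for the longest time."""
    candidates = [i for i, requesting in enumerate(requests) if requesting]
    if not candidates:
        return None
    return min(candidates, key=lambda i: last_used[i])


def rotating_grant(requests, last_owner):
    """Rotating daisy chain: priority starts just after the last bus owner,
    so the arbiter that released the bus is examined last."""
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_owner + offset) % n
        if requests[candidate]:
            return candidate
    return None


if __name__ == "__main__":
    print(time_slice_owner(cycle=5, num_processors=4, slice_length=2))
    print(lru_grant([True, False, True], last_used=[5, 1, 3]))
    print(rotating_grant([True, True, False, True], last_owner=1))
```

In all three rules the winner does not depend on a fixed position along the bus, which is the property that distinguishes dynamic from static arbitration.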
INTERPROCESSOR COMMUNICATION AND SYNCHRONIZATION
The various processors in a multiprocessor system must be provided with a facility for
communicating with each other. A communication path can be established through common
input-output channels.
In a shared memory multiprocessor system, the most common procedure is to set aside a portion
of memory that is accessible to all processors. The primary use of the common memory is to act
as a message center similar to a mailbox, where each processor can leave messages for other
processors and pick up messages intended for it.
The sending processor structures a request, a message, or a procedure, and places it in the
memory mailbox. Status bits residing in common memory are generally used to indicate the
condition of the mailbox, whether it has meaningful information, and for which processor it is
intended.
The receiving processor can check the mailbox periodically to determine if there are valid
messages for it. The response time of this procedure can be long, since a processor
will recognize a request only when polling messages.
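The mailbox with its status bits can be modeled in a few lines. This is a single-threaded sketch for illustration: the class name Mailbox, its field names, and the use of small integers as processor identifiers are all assumptions.

```python
class Mailbox:
    """Shared-memory mailbox with a 'full' status bit and a destination field."""

    def __init__(self):
        self.full = False    # status bit: mailbox holds meaningful information
        self.dest = None     # status field: processor the message is intended for
        self.message = None

    def send(self, dest, message):
        """Place a message in the mailbox; fail if it is still occupied."""
        if self.full:
            return False
        self.message = message
        self.dest = dest
        self.full = True
        return True

    def poll(self, processor_id):
        """Called periodically by a receiver; returns its message or None."""
        if self.full and self.dest == processor_id:
            message = self.message
            self.full, self.dest, self.message = False, None, None
            return message
        return None


if __name__ == "__main__":
    box = Mailbox()
    box.send(dest=2, message="start task")
    print(box.poll(1))  # not the intended receiver
    print(box.poll(2))  # intended receiver picks up the message
```

The message waits in the mailbox until processor 2 happens to poll, which is the delay that the text attributes to this procedure.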
A more efficient procedure is for the sending processor to alert the receiving processor directly
by means of an interrupt signal. This can be accomplished through a software-initiated
interprocessor interrupt by means of an instruction in the program of one processor which, when
executed, produces an external interrupt condition in a second processor. This alerts the
interrupted processor of the fact that a new message was inserted by the interrupting processor.
In addition to shared memory, a multiprocessor system may have other shared resources. For
example, a magnetic disk storage unit connected to an IOP may be available to all CPUs. This
provides a facility for sharing of system programs stored in the disk.
A communication path between two CPUs can be established through a link between two IOPs
associated with two different CPUs. This type of link allows each CPU to treat the other as an
IO device so that messages can be transferred through the IO path.
To prevent conflicting use of shared resources by several processors there must be a provision
for assigning resources to processors. This task is given to the operating system. There are three
organizations that have been used in the design of operating systems for
multiprocessors: master-slave configuration, separate operating system, and distributed
operating system.
In a master-slave mode, one processor, designated the master, always executes the operating
system functions. The remaining processors, denoted as slaves, do not perform operating system
functions. If a slave processor needs an operating system service, it must request it by
interrupting the master and waiting until the current program can be interrupted.
In the separate operating system organization, each processor can execute the operating system
routines it needs. This organization is more suitable for loosely coupled systems where every
processor may have its own copy of the entire operating system.
In the distributed operating system organization, the operating system routines are distributed
among the available processors. However, each particular operating system function is assigned
to only one processor at a time. This type of organization is also referred to as a floating
operating system since the routines float from one processor to another and the execution of the
routines may be assigned to different processors at different times.
In a loosely coupled multiprocessor system the memory is distributed among the processors and
there is no shared memory for passing information.
The communication between processors is by means of message passing through IO channels.
The communication is initiated by one processor calling a procedure that resides in the memory
of the processor with which it wishes to communicate. When the sending processor and
receiving processor name each other as a source and destination, a channel of communication is
established.
A message is then sent with a header and various data objects used to communicate between
nodes. There may be a number of possible paths available to send the message between any two
nodes.
The operating system in each node contains routing information indicating the alternative paths
that can be used to send a message to other nodes. The communication efficiency of the
interprocessor network depends on the communication routing protocol, processor speed, data
link speed, and the topology of the network.
Interprocessor Synchronization
The instruction set of a multiprocessor contains basic instructions that are used to implement
communication and synchronization between cooperating processes.
Communication refers to the exchange of data between different processes. For example,
parameters passed to a procedure in a different processor constitute interprocessor
communication.
Synchronization refers to the special case where the data used to communicate between
processors is control information. Synchronization is needed to enforce the correct sequence of
processes and to ensure mutually exclusive access to shared writable data.
Multiprocessor systems usually include various mechanisms to deal with the synchronization of
resources.
Low-level primitives are implemented directly by the hardware. These primitives are the basic
mechanisms that enforce mutual exclusion for more complex mechanisms implemented in
software.
A number of hardware mechanisms for mutual exclusion have been developed.
One of the most popular methods is through the use of a binary semaphore.
Mutual Exclusion with a Semaphore
A properly functioning multiprocessor system must provide a mechanism that will guarantee
orderly access to shared memory and other shared resources.
This is necessary to protect data from being changed simultaneously by two or more processors.
This mechanism has been termed mutual exclusion. Mutual exclusion must be provided in a
multiprocessor system to enable one processor to exclude or lock out access to a shared
resource by other processors when it is in a critical section.
A critical section is a program sequence that, once begun, must complete execution before
another processor accesses the same shared resource.
A binary variable called a semaphore is often used to indicate whether or not a processor is
executing a critical section. A semaphore is a software-controlled flag that is stored in a memory
location that all processors can access.
When the semaphore is equal to 1, it means that a processor is executing a critical section, so
that the shared memory is not available to other processors.
When the semaphore is equal to 0, the shared memory is available to any requesting processor.
Processors that share the same memory segment agree by convention not to use the memory
segment unless the semaphore is equal to 0, indicating that memory is available. They also
agree to set the semaphore to 1 when they are executing a critical section and to clear it to 0
when they are finished.
Testing and setting the semaphore is itself a critical operation and must be performed as a single
indivisible operation. If it is not, two or more processors may test the semaphore simultaneously and
then each set it, allowing them to enter a critical section at the same time. This action would allow
simultaneous execution of critical sections, which can result in erroneous initialization of
control parameters and a loss of essential information.
A semaphore can be initialized by means of a test and set instruction in conjunction with a
hardware lock mechanism.
A hardware lock is a processor generated signal that serves to prevent other processors from
using the system bus as long as the signal is active. The test-and-set instruction tests and sets a
semaphore and activates the lock mechanism during the time that the instruction is being
executed.
This prevents other processors from changing the semaphore between the time that the
processor is testing it and the time that it is setting it. Assume that the semaphore is a bit in the
least significant position of a memory word whose address is symbolized by SEM.
Let the mnemonic TSL designate the "test and set while locked" operation. The instruction TSL
SEM will be executed in two memory cycles (the first to read and the second to write) without
interference as follows:
R ← M[SEM]    Test semaphore
M[SEM] ← 1    Set semaphore
The semaphore is tested by transferring its value to a processor register R and then it is set to 1.
The value in R determines what to do next.
If the processor finds that R = 1, it knows that the semaphore was originally set. (The fact that it is set
again does not change the semaphore value.) That means that another processor is executing a
critical section, so the processor that checked the semaphore does not access the shared
memory.
If R = 0, it means that the common memory (or the shared resource that the semaphore
represents) is available. The semaphore is set to 1 to prevent other processors from accessing
memory. The processor can now execute the critical section.
The last instruction in the program must clear location SEM to zero to release the shared
resource to other processors. Note that the lock signal must be active during the execution of the test-
and-set instruction. It does not have to be active once the semaphore is set.
Thus the lock mechanism prevents other processors from accessing memory while the
semaphore is being set. The semaphore itself, when set, prevents other processors from
accessing shared memory while one processor is executing a critical section.
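The TSL protocol can be imitated in software with threads standing in for processors. This is a sketch under stated assumptions: a threading.Lock plays the role of the hardware lock signal that makes the read and write cycles indivisible, and the names SharedMemory, tsl, release, worker and run are hypothetical.

```python
import threading
import time


class SharedMemory:
    def __init__(self):
        self.sem = 0                       # M[SEM]: 0 = available, 1 = in use
        self.counter = 0                   # shared data guarded by the semaphore
        self._bus_lock = threading.Lock()  # stands in for the hardware lock signal

    def tsl(self):
        """Test and set while locked: R <- M[SEM], then M[SEM] <- 1."""
        with self._bus_lock:  # no other "processor" can intervene here
            r = self.sem      # first memory cycle: read
            self.sem = 1      # second memory cycle: write
        return r

    def release(self):
        """Clear SEM to 0 to release the shared resource."""
        self.sem = 0


def worker(mem, iterations):
    for _ in range(iterations):
        while mem.tsl() == 1:  # R = 1: another processor is in its critical section
            time.sleep(0)      # give the other threads a chance to run
        mem.counter += 1       # critical section (R was 0)
        mem.release()


def run(num_threads=4, iterations=200):
    mem = SharedMemory()
    threads = [threading.Thread(target=worker, args=(mem, iterations))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mem.counter


if __name__ == "__main__":
    print(run())
```

The lock is held only for the duration of tsl, not for the whole critical section, which mirrors the note above that the lock signal need not stay active once the semaphore is set.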