Server+ Study Guide

Gary Govanus
with William Heldman
Jarret Buse

www.sybex.com
Neil Edde
Associate Publisher, Certification
Sybex, Inc.
www.sybex.com
SYBEX Inc.
Customer Service Department
1151 Marina Village Parkway
Alameda, CA 94501
(510) 523-8233
Fax: (510) 523-2373
e-mail: [email protected]
Web: http://www.sybex.com
After the 90-day period, you can obtain replacement
media of identical format by sending us the defective disk,
proof of purchase, and a check or money order for $10,
payable to SYBEX.
Disclaimer
SYBEX makes no warranty or representation, either expressed or
implied, with respect to the Software or its contents, quality, performance, merchantability, or fitness for a particular purpose. In
no event will SYBEX, its distributors, or dealers be liable to you
or any other party for direct, indirect, special, incidental, consequential, or other damages arising out of the use of or inability to
use the Software or its contents even if advised of the possibility of
such damage. In the event that the Software includes an online
update feature, SYBEX further disclaims any obligation to provide this feature for any specific duration other than the initial
posting.
The exclusion of implied warranties is not permitted by some
states. Therefore, the above exclusion may not apply to you.
This warranty provides you with specific legal rights; there may
be other rights that you may have that vary from state to state.
The pricing of the book with the Software by SYBEX reflects the
allocation of risk and limitations on liability contained in this
agreement of Terms and Conditions.
Shareware Distribution
This Software may contain various programs that are distributed
as shareware. Copyright laws apply to both shareware and ordinary commercial software, and the copyright Owner(s) retains all
rights. If you try a shareware program and continue using it, you
are expected to register it. Individual programs differ on details of
trial periods, registration, and payment. Please observe the
requirements stated in appropriate files.
Copy Protection
The Software in whole or in part may or may not be copy-protected or encrypted. However, in all cases, reselling or redistributing these files without authorization is expressly forbidden
except as specifically provided for by the Owner(s) therein.
Acknowledgments
You know, this is the toughest part of the entire book to write. You may not
believe it, but it is true. So many people have done so much to get this book on
the shelf and into your hands that it is not possible for me to list them all. I also
have the task of trying to have you understand how important all of those people are to this project. Believe me, it is much easier explaining how Ethernet or
a router works than to try to explain the differences between an acquisition
and development editor, a production editor, and an editor!
Most importantly, I would like to thank my wife, Bobbi, for all her love
and understanding during this process. I have been writing for Sybex almost
continuously for two years, and she has been wonderful during the whole
time. It is a lot harder than it sounds, because when there are deadlines, or
I am trying to teach and to write at the same time, something has to give, and
usually it is the attention I pay to her. She knows how much I love doing this,
so she puts her needs on the back burner. She really is a wonderful woman,
and I am very lucky to have her in my life.
There are others who get shortchanged when I write. I sometimes have to
really work to find the time to make my daughters, Dawn and Denise, crazy.
Fortunately, they are now old enough where they have very full, successful
lives of their own. I don't get to see my grandkids nearly enough, so as soon
as I finish this thing, I am taking them to Disney World. Brandice has been
there before (several times), but CJ and Courtney haven't, so it will be a treat
for Poppy to see the wonder in their eyes. My parents have not had as much
of my time as they deserve either, and for that I am sorry. Finally, there is my
best friend, John Hartl. John is a quiet man, but can do a wonderful job of
laying guilt. He did it when he pointed out that seeing your best friend once
a year was not enough, and he was tired of me using the *&^% book as an
excuse. He is right!
Now for the people on the production team. This is the first book I have
done with Elizabeth Hurley. Since she approached me with this project, she
has been promoted to an acquisition and developmental editor, a position
she richly deserves. Soon she will be running the place. Her good humor and
infectious laugh can always brighten my day. I hope that this book will justify the faith she has had in me. Every time I came to her with a question, she
would say, "Gary, you do what you think is best; I trust you completely."
You have no idea how close we came to changing this book into the novel I
always wanted to write!
Introduction
The Server+ certification tests are sponsored by the Computing Technology Industry Association (CompTIA) and supported by several of the
computer industry's biggest vendors (for example, Compaq, IBM, and
Microsoft). This book was written to provide you with the knowledge you
need to pass the exam for Server+ certification. Server+ certification gives
employers a benchmark for evaluating their employees' knowledge. When
an applicant for a job says, "I'm Server+ certified," the employer can be
assured that the applicant knows the fundamental server and networking
concepts. For example, a Server+ certified technician should know the difference between the various types of hard disk subsystems and how to configure them, the differences between various server types, and the
advantages and disadvantages of different network operating systems.
This book was written at an intermediate technical level; we assume that you
already know some of the information covered by the A+ certification and are
familiar with hardware basics. The exam itself covers basic server topics as well as some more
advanced issues, and it covers some topics that anyone already working as a
technician, whether with computers or not, should be familiar with. The exam
is designed to test you on these topics in order to certify that you have enough
knowledge to intelligently discuss various aspects of server operations.
We've included review questions at the end of each chapter to give you a
taste of what it's like to take the exam. If you're already working as a network
administrator, we recommend you check out these questions first to gauge
your level of knowledge. You can begin measuring your level of expertise by
completing the assessment test at the end of this Introduction. Your score will
indicate which areas need improvement. You can use the book mainly to fill
in the gaps in your current knowledge of servers.
If you can answer 80 percent or more of the review questions correctly for
a given chapter, you can probably feel safe moving on to the next chapter. If
you're unable to answer that many correctly, reread the chapter and try the
questions again. Your score should improve.
Don't just study the questions and answers; the questions on the actual
exam will be different from the practice ones included in this book and on the
CD. The exam is designed to test your knowledge of a concept or objective, so
use this book to learn the objective behind the question.
It is possible to pass this test without any reference materials, but only if
you already have the knowledge and experience that come from reading
about and working with servers. Even experienced server people tend to
have what you might call a 20/80 situation with their computer knowledge: they may use 20 percent of their knowledge and skills 80 percent of
the time, and rely on manuals, guesswork, the Internet, or phone calls for the
rest. By covering all the topics that are tested by the exam, this book can help
you refresh your memory concerning topics that, until now, you seldom
used. (It can also serve to fill in gaps that, let's admit, you may have tried to
cover up for quite some time.) Further, by treating all the issues that the
exam covers, this book can serve as a general field guide, one that you may
want to keep with you as you go about your work.
In addition to reading the book, you might consider practicing these objectives
through an internship program. (After all, all theory and no practice make for a
poor technician.)
To test your knowledge as you progress through the book, check out the
review questions at the end of each chapter. As you finish each chapter,
answer the review questions and then check to see if your answers are right;
the correct answers appear on the page following the last review question.
You can go back and reread the section that deals with each question you got
wrong to make sure you answer correctly the next time you are tested on
the material.
On the CD-ROM youll find two sample exams. You should test your
knowledge by taking the practice exam when you have completed the book
and feel you are ready for the Server+ exams. Take this practice exam just as
if you were actually taking the Server+ exam (i.e., without any reference
material). When you have finished the practice exam, move on to the bonus
exam to solidify your test-taking skills. If you get more than 90 percent of the
answers correct, youre ready to go ahead and take the real exam.
The CD-ROM also includes several extras you can use to bolster your
exam readiness:
Electronic flashcards You can use these 150 flashcard-style questions to
review your knowledge of Server+ concepts. They are available for PCs
and handheld devices. You can download the questions right into your
Palm device for quick and convenient reviewing anytime, anywhere
without your PC!
Test engine The CD-ROM includes all of the questions that appear in this
book: the assessment questions at the end of this introduction and all of the
chapter review questions. Additionally, it includes a practice exam and a
bonus exam. The questions appear much like they did in the book, but you
can also choose to randomize them. The randomized test will allow you to
pick a certain number of questions to be tested on, and it will simulate the
actual exam. Combined, these test engine elements will allow you to test your
readiness for the real Server+ exam.
Full text of the book in PDF If you are going to travel but still need to
study for the Server+ exam and you have a laptop with a CD-ROM drive, you
can take this entire book with you just by taking the CD-ROM. This book is
in Adobe Acrobat PDF format so it can be easily read on any computer.
Job Dimension                  % of Exam (approximate)
1.0: Installation              17%
2.0: Configuration             18%
3.0: Upgrading                 12%
4.0: Proactive Maintenance     9%
5.0: Environment               5%
6.0: Troubleshooting           27%
7.0: Disaster Recovery         12%
1.0: Installation
1.1 Conduct pre-installation planning activities:
Verify that all correct components and cables have been delivered.
1.2 Install hardware using ESD best practices (boards, drives, processors, memory, internal cable, etc.):
Install UPS.
2.0: Configuration
2.1 Check/upgrade BIOS/firmware levels (system board, RAID, controllers, etc.).
3.0: Upgrading
3.1 Perform full backup:
Verify backup.
Verify N+1 stepping.
5.0: Environment
5.1 Recognize and report on physical security issues:
Interpret error logs, operating system errors, health logs, and critical
events.
Locate and effectively use hot tips (e.g., fixes, OS updates, E-support,
Web pages, CDs).
6.3 Identify bottlenecks (e.g., processor, bus transfer, I/O, disk I/O, network
I/O, memory).
6.4 Identify and correct misconfigurations and/or upgrades.
6.5 Determine if problem is hardware, software, or virus related.
Use the technique of hot swap, warm swap, and hot spare to
ensure availability.
Use the concepts of fault tolerance/fault recovery to create a disaster recovery plan.
7.2 Restoring.
Bring two forms of ID with you. One must be a photo ID, such as a
driver's license. The other can be a major credit card or a passport.
Both forms must have a signature.
Arrive early at the exam center so you can relax and review your study
materials, particularly tables and lists of exam-related information.
On form-based tests, because the hard questions will eat up the most time,
save them for last. You can move forward and backward through the
exam. When the exam becomes adaptive, this tip will not work.
For the latest pricing on the exams and updates to the registration procedures, call Prometric at (800) 755-EXAM (755-3926) or (800) 77-MICRO
(776-4276). If you have further questions about the scope of the exams or
related CompTIA programs, refer to the CompTIA site at www.comptia.org/.
Assessment Test
1. In a Fibre Channel configuration, what constitutes a point-to-point link?
A. Arbitrated loop
B. Fabric
C. A bidirectional link that connects the N_ports on two nodes
D. Two NL_ports connected to two FL_ports
2. What do you call a list of IP addresses that can be assigned by an auto-
A. Nothing; the system is none the wiser.
B. An error message pops up on the screen describing the error to the
end user and giving the user a chance to fix the problem.
C. An entry is made in the memory error log, but the system continues
to operate.
D. The system is halted.
8. What happens when a parity-checking memory module determines that data has been corrupted?
A. Nothing; the system is none the wiser.
B. An error message pops up on the screen describing the error to the
end user and giving the user a chance to fix the problem.
C. An entry is made in the memory error log, but the system continues
to operate.
D. The system is halted.
9. How many interrupts are available with PCI?
A. 64
B. 32
C. 16
D. 8
10. How can you configure load balancing in a PCI Bridged environment?
A. Configure one bridge as a master, and the other as a slave.
B. You will have to buy special devices to make this work.
C. You will have to purchase a special connector.
D. Load balancing is not recommended in a bridged environment.
16. Four network cards grouped together for Load Balancing will have
check to make sure the database is getting backed up, so you try to
restore one of the files to another server. You find the file was not
backed up. What is a likely reason for this happening?
A. The file had not been accessed that day.
B. The tape backup program cannot back up open files.
C. The tape backup program cannot back up files that big.
D. The tape backup program did not run.
must be configured?
A. A DNS Server
B. A relay Agent
C. Another DHCP Server
D. SMTP
E. DMI
25. Name three ways NICs can work together.
A. Adapter Grouping
B. Adapter Fault Tolerance
C. Adapter Virtual Private Networks
D. Adapter Load Balancing
E. Adapter Teaming
doing.
B. They provide a background of what has been done to a computer.
C. They provide an instruction manual for doing routine tasks.
30. What is the plenum?
A. The type of metallic shielding surrounding a fiber optic cable
B. The type of cable used in fiber optic installations
C. The air space between the ceiling and the actual roof of a building
D. Precious metal like gold
31. With which Internet standard protocol is Active Directory accessed?
A. SNMP
B. SMTP
C. LDAP
D. POP3
32. Will an ATA 100 device use the same type of cable as an ATA 66 device?
A. No
B. Yes
33. A BNC connector is used on what type of Ethernet implementation?
A. ThinNet
B. Thicknet
C. UTP
D. STP
34. How many terminators are there on a ThinNet network?
A. One
B. Two
C. One for every 50 hosts
D. One for every 100 hosts
35. Which is true of fiber optics?
A. It is affected by EMI.
B. It is affected by heat.
C. The cable can be made of glass.
D. The cable is always made of copper.
match.
4. B. You would have three 20GB drives for storage and one 20GB drive
for parity. Therefore, you would have 60GB of usable storage space.
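The arithmetic behind answer 4 can be sketched in a couple of lines (a minimal illustration only, not tied to any real RAID tool):

```python
# RAID 5 capacity arithmetic: one drive's worth of space is consumed by
# parity, so usable space is (number of drives - 1) * drive size.
def raid5_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    return (drive_count - 1) * drive_size_gb

print(raid5_usable_gb(4, 20))  # four 20GB drives -> 60GB usable
```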
5. A. The performance of the RISC processor depends on the code it is
executing.
6. B. Memory interleaving is a way of quickly getting access to information.
7. With ECC memory, single-bit errors will be detected and corrected. ECC memory can determine corruption of up to 4 bits, but with anything over 1 bit, the system is halted.
8. D. With parity, if it is determined there has been some corruption, the
system is halted.
9. C. PCI can use up to 16 interrupts.
10. D. If you are using a bridged architecture, load balancing is not rec-
ommended.
11. B, D, F. I2O is made up of three software layers: the OS Services Module (OSM), the I2O Messaging Layer (IML), and the Hardware Device Module (HDM).
12. A, B, D. IDE devices can be a master with no slave present, a master
IP address.
segments.
19. C. Fault Tolerance requires at least two cards, not at least two ports.
20. B. Many tape backup programs are not capable of backing up open files.
chain.
22. A. A peer-to-peer application server would be the type that may be
used by gamers.
23. A, B. There are only two types of cache: L1 and L2.
24. B. A relay Agent must be configured.
25. B, D, E. Adapters can work together with Load Balancing, Fault Tol-
erance, or Teaming.
26. D. PIO is the abbreviation for Programmed Input/Output.
27. D. The U rating is the number of mounting holes that the device will
a computer.
30. C. The plenum is the air space between the drop-down ceiling and the actual roof of a building.
Chapter 1
Disk Subsystems
SERVER+ EXAM OBJECTIVES COVERED IN
THIS CHAPTER:
1.2 Install hardware using ESD best practices (boards, drives,
processors, memory, internal cable, etc.).
Install UPS.
Use the technique of hot swap, warm swap, and hot spare to
ensure availability.
7.2 Restoring
Don't you just hate it? You buy the darn book, hoping to be
slowly and gently eased into the studying process, and in the very first chapter
the author nails you with a ton of objectives. Well, take heart, if you have
passed the Network+ test and the A+ test, about 20% of the material in this
book will be old hat! You will already know it.
The Server+ exam is designed to give you background into the inner
workings of your local server platform. Throughout this book we are going
to be talking about the different types of hardware that make up a server, the
different types of servers that can be put into your network, the different
types of server operating systems, how to care for your servers, and how to
fix them if they break. I suppose we could subtitle this book, "The Care and
Feeding of Network Servers."
To make this daunting task a little easier, we are going to break it down
into chunks. As you can see, the first chunk deals with the disk subsystem of
the server. In this first chapter, we will cover four basic areas: Logical and
Physical Drives, SCSI, RAID, and hot-swappable drives. We'll spend some time
hashing out terminology, discussing strengths and weaknesses, and looking
at fault tolerances. So lets get to it.
For complete coverage of objective 1.2, please also see Chapters 6, 7, and 8. For
complete coverage of objective 3.3, please also see Chapter 2. For complete coverage of objective 3.6, please also see Chapters 2 and 10. For complete coverage
of objectives 7.1 and 7.2, please also see Chapter 12.
In this section, we are going to talk about Logical Drives and Physical
Drives and describe their functionality. We will take a look at how the people
who use the server view the disk subsystem.
Now, if your users are like all the users I have ever dealt with, 95% of
them don't care how or why the network operates; they just want it to work
every time. Does that sound familiar? When it gets into the subject of drives,
they couldn't care less, as long as they can store and retrieve their information.
And that is just the way it should be.
Logical Drives
Every network that I have ever worked on has had a drive letter mapped to
an area that was fondly referred to as the Users home directory. Depending
on the network operating system you are using, it may be called the Users
directory, or the mount point, or something else esoteric, but every system
has one. This is the place where your users can store their highly personal,
private user stuff. You know, like all the jokes they received via e-mail over
the weekend. Anyway, to make this drive easier for users to access, it is
assigned a drive letter; for instance, in my network it is the H: drive. Now,
users don't refer to this area as their Users home directory; they simply call
it their H: drive. Well, what exactly is their H: drive?
If you were to ask an end user that question, he would probably tell you
that somewhere, back in the deep dark reaches of the computer room, there
would be a wall. Mounted on this wall would be dozens of physical hard
drives and each one of these hard drives would have a little plastic strip on
it, with a name. So, if you wanted to find Elizabeths H: drive, you would
simply find the wall, look to the strips that handled the Es, and there, about
halfway down, would be the drive for Elizabeth. In reality, the Users home
directory is just that: it is a directory that is part of a much larger directory
structure. Using Microsoft Explorer, Figure 1.1 shows a small sample of a Users
directory.
FIGURE 1.1   A sample Users directory
Now, I suppose right from the start we should get the biases out in the open.
First of all, I have been working with computers since the '80s, back when DOS
was king and GUI was something slimy. So, I have some problems that you
should know about. The first is with the interchangeable terms directories and
folders. In the world of GUI, a folder represents a storage area created on a
disk for the storage and retrieval of information. In the world of DOS, the same
thing was called a directory. Being an old dog, I find it hard to learn new tricks,
so if you see the word directory, and you are more comfortable with folder, go
for it. In my world, in this context, they are interchangeable.
Depending on the network operating system you are using, these areas are
referred to as mapped drives, shares, or mount points. It all amounts to the
same thing. A drive letter has been assigned as a pointer to a particular directory or folder on a bigger physical device. It does not even have to be a network operating system. DOS will support up to 23 Logical Drives on a
system. As far as the user is concerned, it is a drive just like the C: drive in
their computer. As far as you are concerned, it is a Logical Drive, or drive letter that has been assigned as a pointer to a distinct directory or folder on a
larger physical device.
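As a rough illustration of the idea that a drive letter is only a pointer, here is a small sketch in Python; the server name and paths are made up for the example:

```python
# Hypothetical mapping table: each logical drive letter points at a
# directory on a larger physical volume somewhere on the network.
drive_mappings = {
    "H:": r"\\FILESERVER\Users\Elizabeth",  # the user's "H: drive"
    "S:": r"\\FILESERVER\Shared",
}

def resolve(logical_path: str) -> str:
    """Translate a path such as 'H:\\jokes.txt' into the location it points to."""
    drive, _, rest = logical_path.partition("\\")
    return drive_mappings[drive.upper()] + "\\" + rest

print(resolve("H:\\jokes.txt"))  # \\FILESERVER\Users\Elizabeth\jokes.txt
```

The user only ever sees "H:"; the administrator sees the directory on the physical device that the letter resolves to.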
Physical Drives
If Logical Drives are pointers to directories or folders on physical devices, it
makes sense that the Physical Drive is what you can hold on to and install
into a file server. A Physical Drive can be a hard drive, a floppy drive, or even
removable storage.
These drives come in a variety of sizes, from the standard 1.44MB floppy
drive to the hard drives that can go over 30 gigabytes. The hard drives also
come in a variety of different technologies and configurations, with a wide
variety of different acronyms that you are going to have to be familiar with
things like IDE, EIDE, ATA, SCSI, RAID, hot swap, hot plug, and even hot
spare. Over the rest of this chapter and the next chapter, we will be talking
about the differences between these.
There are several limitations to the way EIDE handles devices. The major
limitations are the number of devices that can be controlled from a single paddle
card and the lack of redundancy. If one device fails, it can take down your
entire subsystem. Small Computer System Interface (SCSI), on the other hand,
addresses those issues. SCSI is just another type of interface that is much more
extensible than IDE. Besides hard drives, SCSI will work with CD-ROMs and all
sorts of other wonderful things. Even from its early days, SCSI (pronounced
"scuzzy") allowed you to string several devices together in
what is referred to as a daisy chain. SCSI was originally designed as a high-speed
system-level parallel interface. SCSI has evolved, and now there are all sorts of
different levels and speeds. We will explore each of them in this section.
Let's start with the definitions of the interfaces and how they are used.
When we start talking about synchronous and asynchronous, the term clocked
comes into play. If you are not familiar with what it means, think of your dial-up
connection to the Internet. In this case, you are using an asynchronous modem.
Communication occurs randomly, when and where you want it to. Since there is
no regularity, this is not clocked. With synchronous, whenever a communication
link is established, certain tasks are carried out at regularly timed intervals, and
therefore they are clocked.
FIGURE 1.2   A SCSI controller connected to four devices (Device 1 through Device 4) on a single bus
As you can see, we have a SCSI controller and four SCSI devices. These
devices could be four hard drives or three hard drives and a tape drive, or
two hard drives, a tape drive, and a CD-ROM. You get the point. Anyway,
suppose the computer sent messages to the controller to write information to
Device 1 and to Device 3. With SCSI, the controller would send a signal to
Device 1 telling it that the controller had some work for it to do. Device 1
would respond, and the controller would send the information. Device 1
would send back an acknowledgement and the controller would now go on
to the information that had to be written to Device 3. The trick here is that
the controller would do things one step at a time, and could not multitask.
That was changed in SCSI-2.
So, how does the controller know which device is which? Well, just like in
the diagram, each device is assigned its own unique device number, called the
SCSI address. The entire SCSI subsystem is referred to as the SCSI bus. The
SCSI address was configured in a variety of ways, including jumpers or rocker
switches. That way, when the controller needed to send information to that
device, it just used the appropriately addressed wire. In order to keep the signals on the wire, each SCSI bus had to be terminated at both ends. We will talk
more about termination after we get through defining the types of SCSI.
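The addressing and termination rules just described can be captured in a toy model (pure illustration, not any real driver API; names are made up):

```python
# Toy model of a SCSI bus: every device needs a unique SCSI ID, and the
# chain must carry a terminating resistor at both physical ends.
class ScsiBus:
    def __init__(self, max_ids=8):           # SCSI-1/SCSI-2: IDs 0-7
        self.max_ids = max_ids
        self.devices = {}                    # SCSI ID -> description
        self.terminated_ends = set()         # subset of {"near", "far"}

    def attach(self, scsi_id, description):
        if not 0 <= scsi_id < self.max_ids:
            raise ValueError(f"ID {scsi_id} outside 0-{self.max_ids - 1}")
        if scsi_id in self.devices:
            raise ValueError(f"ID conflict: {scsi_id} already in use")
        self.devices[scsi_id] = description

    def terminate(self, end):
        self.terminated_ends.add(end)

    def is_healthy(self):
        # Signals stay on the wire only when both ends are terminated.
        return self.terminated_ends == {"near", "far"}

bus = ScsiBus()
bus.attach(7, "controller")   # host adapters traditionally claim ID 7
bus.attach(0, "hard drive")
bus.attach(3, "tape drive")
bus.terminate("near")
bus.terminate("far")
print(bus.is_healthy())       # True
```

Attaching a second device at an ID already in use raises an error, which mirrors the real-world symptom of two devices jumpered to the same SCSI address.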
Not only is SCSI flexible in the kinds of computers it can work in, SCSI
is also flexible in the kinds of devices it can work with. For example, SCSI
can work with tape drives, hard drives, and CD-ROMs, to name a few.
These devices can be internal to the computer or external, in a separate case.
If the devices were internal, they would use a 50-pin ribbon cable. If the
devices were external, they would use a very thick, shielded cable that had a
Centronics 50-pin adapter on one end and a DB-25 connector on the other.
A SCSI-2 connector
Now just to confuse you, when they widened the data transfer path from
8 to 16 bits, the 16-bit version of SCSI-2 was also referred to as Wide SCSI.
When the clock rate was doubled, the transfer speed climbed to 10.0 Mbytes/second; this
was referred to as Fast SCSI. With Fast SCSI, there was also some new terminology: instead of Mbytes/second, there is the Mega Transfer (MT). The MT
is a unit of measurement that refers to the rate of signals on the interface,
regardless of the width of the bus. So, as an example, if you have a 10MT
rate on a Narrow SCSI bus, the transfer rate would be 10 Mbytes/second. If,
however, you had the same 10MT rate on a Wide bus, it would result in a
20 Mbyte/second transfer rate. The developers finally took the Wide SCSI
technology and combined it with Fast SCSI, and that became Fast-Wide
SCSI, with a transfer speed of 20 Mbytes/second.
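The MT arithmetic above comes down to one multiplication (a sketch for illustration only):

```python
# Mega Transfer (MT) arithmetic: the transfer rate in Mbytes/second is the
# MT rate times the bus width in bytes.
def transfer_rate_mbytes(mt_rate, bus_width_bits):
    return mt_rate * (bus_width_bits // 8)

print(transfer_rate_mbytes(10, 8))    # 10 MT on a Narrow bus -> 10 Mbytes/second
print(transfer_rate_mbytes(10, 16))   # 10 MT on a Wide bus   -> 20 Mbytes/second
print(transfer_rate_mbytes(20, 16))   # 20 MT on a Wide bus   -> 40 Mbytes/second
```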
SCSI-2 was backward compatible with SCSI-1, but for maximum benefit, it
was suggested that you stick with one technology or the other, preferably
using a SCSI-2 controller with SCSI-2 devices. With both SCSI-1 and SCSI-2,
the number of peripherals that could be connected to any controller was seven.
SCSI-3
Although SCSI is maturing, it is not completely there yet. There are still some
limitations, like having no more than seven devices connected to any controller.
The next generation of SCSI, SCSI-3, takes care of some of that.
Now SCSI-3 is still a proposed ANSI standard, but there are a lot of devices
out there purporting to be SCSI-3. That is because the SCSI-3 documentation
took the very large SCSI-2 specification (in excess of 400 pages) and split it
into smaller, bite-size chunks. These smaller documents cover different layers of
how the interface will be defined. For example, the following layers are
included:
physical, which covers things like the connectors, the pin assignments,
and the electrical specifications. This document is called SCSI Parallel
Interface (SPI).
protocol, which covers all the physical layer activity and how it is
organized into bus phases and packets.
Now, when the standards folks started working on this, they recognized
how quickly things were changing, so they layered the specifications to allow
substitution in different parts of the specifications as the technology evolves.
One example would be the standards for the SCSI Fibre Channel interface disk
drive. In this case, the physical and protocol layers would have to be replaced
with new documents, but the other three layers could remain the same.
So, since the newest features are going to show up in SCSI-3, and since
SCSI-3 will be generally higher-performing, you can expect that a SCSI-3 device will exhibit better performance than its SCSI-2 brethren. One of
the first things people realized with SCSI-3 was that the number of
peripherals changed. Now, you could have a maximum of 16 devices.
Since there was the possibility of having 16 devices on the chain, the
length of the cable had to increase also. SCSI-3 also saw the added support for a serial interface and for a fiber optic interface. Data transfer
rates depended on the way the hardware was implemented, but the data
rates could actually climb to hundreds of megabytes per second.
Now it is time to get into some of the ways the SCSI-3 standards are
broken up.
Dont you wish they would come up with just one name for this stuff and
stick with it?
Ultra320
Ultra320 SCSI is the one that is not off the drawing board yet, but it is going
to feature data transfer rates up to 320 Mbytes/second. Ultra320 was first
defined in SPI-4.
If the cable is not Single Ended, it is Differential SCSI. Differential SCSI comes in
Low Voltage Differential or High Voltage Differential, and these devices are not
compatible on the same bus segment without an electronic device such as a SCSI
converter to convert between Single Ended and Differential. With rare exception,
no software (driver) modifications are necessary for conversion between Single
Ended and Differential. There are several variations of terminators developed for
use with Single Ended SCSI and Differential SCSI.
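The mixing rule described in the note can be sketched as a toy check (illustration only; the function and labels are made up, and real buses have further nuances):

```python
# Toy rule from the text: Single Ended and Differential devices cannot share
# a bus segment unless a SCSI converter sits between them.
def can_share_segment(signaling_a, signaling_b, converter_present=False):
    if signaling_a == signaling_b:
        return True
    return converter_present

print(can_share_segment("single-ended", "single-ended"))               # True
print(can_share_segment("single-ended", "high-voltage-differential"))  # False
```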
So, what this means to you, the server administrator, is confusion. See, the
cable that is used for Single Ended SCSI and the cable that is used for Differential SCSI look the same, even though they are electrically different. To
make matters worse, both Single Ended and Differential can use each of the
cable types listed in Table 1.1.
TABLE 1.1: SCSI Cable Types

Type A cable
Type B cable
Type P cable
Firewire cable
Figure 1.4 shows the different types of cable ends for different types of
SCSI devices. About the only way to tell the difference between Single Ended
Devices and Differential Devices is with the judicious use of a volt/ohm
meter.
FIGURE 1.4

SCSI Type                       Speed               Connector
SCSI-1 (AKA 8-bit or Narrow)    5 Mbytes/second     50-pin high-density, used for things like Iomega JAZ drives or writable CD-ROMs
Fast SCSI (8-bit Narrow)        10 Mbytes/second    50-pin high-density, used for things like Iomega JAZ drives or writable CD-ROMs
Ultra SCSI (8-bit Narrow)       20 Mbytes/second    50-pin high-density, used for things like Iomega JAZ drives or writable CD-ROMs
Wide SCSI (16-bit Wide)         20 Mbytes/second    68-pin high-density, used with hard disk drives
Wide Ultra SCSI (16-bit Wide)   40 Mbytes/second    68-pin high-density, used with hard disk drives
Ultra 2 SCSI (16-bit Wide)      80 Mbytes/second    68-pin high-density, used with disk drives
Ultra160 SCSI (16-bit Wide)     160 Mbytes/second   68-pin high-density, used with disk drives
So, now that you know all about the different types of SCSI, it is time to ask yourself why it is important. First of all, SCSI is extensible. If you have worked around networking for any length of time at all, you know that there is no such thing as too much disk space. If you run out of disk space, it is nice to know that you can add on another drive, or group of drives, without much hassle. Cost may be another matter, but there won't be much hassle, as long as you understand termination.
SCSI Termination
A long, long time ago, when I first started playing with hardware, the
term SCSI was sometimes enough to bring fear and trepidation into the
hearts of the best of technicians, all because of a couple of small terminating
resistors.
Earlier we mentioned that a SCSI chain has to be terminated at both ends.
That sounds really easy and simple. Sometimes, when you have a combination
of several internal devices connected to several external devices, it is not the
easiest of jobs to locate the end of a chain. In addition, some devices were terminated by a small resistor plugged into the device, some by jumpers or DIP switches, and sometimes it was a combination of the two. Some devices had terminating resistors that were large, silver, and difficult to lose. So, you always had to remember the basics of SCSI troubleshooting: problems are usually caused by termination. When in doubt, break down the chain and add one device at a time until you find the device that is causing the problem, or until you get the entire chain working. It could lead to a trying day. Things have gotten better: some devices now are self-terminating; they just sense whether they are at the end of the chain and terminate themselves.
SCSI termination is just electrical circuitry installed at the end of a cable, designed to match impedances and prevent electrical signals from reflecting when they reach the end of the cable. In SCSI, this is done with a device called a terminator.
When working with any SCSI bus segment, remember there should be
two terminators and only two terminators. Not one, not three, but two
terminators. Also, the terminators must be installed at the very ends of
the SCSI cable, not at devices in the middle of the bus.
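The two-terminator rule lends itself to a quick mechanical check. Here is a minimal Python sketch; the list-of-tuples bus description is my own invention for illustration, not anything from the SCSI specifications:

```python
# Sketch: validate a SCSI bus description against the rule in the text --
# exactly two terminators, located only at the two ends of the bus.

def termination_ok(devices):
    """devices: list of (name, terminated) tuples, in physical bus order."""
    terminated = [t for _, t in devices]
    return (terminated.count(True) == 2     # two terminators, no more, no less
            and terminated[0]               # one at the first device...
            and terminated[-1])             # ...and one at the last device

bus = [("host adapter", True), ("disk 0", False), ("cd-rom", True)]
print(termination_ok(bus))   # True
```

A bus with a terminator left on a middle device, or with three terminators, would fail this check, which mirrors the troubleshooting advice above.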
When you talk about SCSI termination, there are four basic types: Passive,
Active, Force Perfect Termination (FPT), and LVD (including LVD/MSE).
Let's explore them one by one:
Passive The simplest form of termination is referred to as Passive. The
terminator consists of a 220-ohm resistor that goes from the TERMPWR
to the signal line and another 330-ohm resistor that goes from the signal
line to ground. This form of termination does not cost much, but there are
disadvantages. For example, if there is a fluctuation in the TERMPWR
voltage, it will show up on the signal lines of the bus. That may be enough
to cause data errors. If your system is using SCSI-2, it is recommended
that you use Active terminators whenever possible for Single Ended SCSI.
Passive terminators are always used with differential (HVD) SCSI.
Active If the termination is not Passive, it must be taking an Active role.
Active termination is referred to as Alternative 2 in SCSI-2. Active termination was developed because of the problems with Passive termination.
To solve those problems, Active terminators have a voltage regulator.
This regulator serves to reduce the fluctuation effect down to practically
nothing. Active termination uses only a 110-ohm resistor, which is
installed from the regulator to the signal line. This provides a much closer
match to the normal impedance of a SCSI cable. This closer match means
a more stable signal, which creates less signal reflection and thus fewer
data errors.
Force Perfect Termination (FPT) Although FPT is not recognized in any of the SCSI specifications, it is a Single Ended termination method that uses diode switching and biasing to make up for any impedance mismatches between the SCSI cabling and the peripheral device, whatever it may be. Since FPT is not part of the specifications, it should not come as a surprise that there are several types of FPT, and these different types may not be totally compatible. Also, by and large you can assume that FPT only works and plays well with FPT.
Low Voltage Differential (LVD) The terminator for LVD uses a form of Active termination. It supports the faster speeds and lower power consumption of LVD as compared to HVD, and it works with Ultra 2 and Ultra 3 SCSI.
LVD/MSE Finally, there is what is referred to as LVD/MSE, which is LVD that makes use of multimode transceivers. An LVD/MSE terminator checks the voltage level appearing on the DIFFSENSE pin of the cable; by sensing that voltage level, it knows to automatically configure itself for LVD or for Single Ended operation. Most new SCSI designs include these multimode transceivers.
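To see why Passive termination is so sensitive to TERMPWR, you can work the divider math from the Passive description above. This Python sketch assumes a nominal 5V TERMPWR; the spec actually allows a range of roughly 4.25 to 5.25V, so treat the exact figures as illustrative:

```python
# Passive SCSI terminator: a 220-ohm resistor to TERMPWR and a 330-ohm
# resistor to ground on each signal line. The pair forms a voltage divider
# that biases the idle line, and their parallel combination is the
# termination impedance the cable sees.

TERMPWR = 5.0          # volts -- nominal assumption; spec allows a range
R_TO_TERMPWR = 220.0   # ohms
R_TO_GROUND = 330.0    # ohms

# Idle signal voltage set by the divider
bias = TERMPWR * R_TO_GROUND / (R_TO_TERMPWR + R_TO_GROUND)

# Effective termination impedance (the two resistors in parallel)
impedance = (R_TO_TERMPWR * R_TO_GROUND) / (R_TO_TERMPWR + R_TO_GROUND)

print(f"idle bias: {bias:.2f} V")          # 3.00 V
print(f"impedance: {impedance:.0f} ohms")  # 132 ohms
```

Notice two things: any wobble on TERMPWR shows up directly in that bias voltage, and 132 ohms is a poor match for typical SCSI cable impedance, which is why the Active terminator's regulated 110-ohm arrangement produces a cleaner signal.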
So, now let's see how to put it into action. Take a look at Figure 1.5.
FIGURE 1.5
As you can see, the host adapter and the last device in the chain are terminated. But what about those things called SCSI IDs?
Remember, SCSI IDs must be unique on the chain. You cannot have two
device 3s on the chain.
How do you choose which address to assign to which device? Let's look at an example. To keep this simple, let's use an old 8-bit bus, because then we don't have so many numbers to work with. Suppose that we have our controller, three hard drives, and a CD-ROM. If we are using a regular PC SCSI, we have an ID range of 0 to 7. Remember, we are geeks, and all geeks start counting at 0. In this case, the controller would be set to ID 7, because the higher the number, the higher the priority. As for the rest of the devices, it really doesn't matter, as long as the IDs are unique. For simplicity, we would probably address the hard drives as 0, 1, and 2 and make the CD-ROM device 3. In this chain, there would be a terminator on the controller and a terminator on the CD-ROM.
Usually, set the slowest device with the highest number, which will give it the highest priority. Also, start your numbering at 0. When you boot your system, the SCSI controller will attempt to contact each device in the chain, starting at the lowest number. If you have numbered everything from 6 down, you are going to spend a lot of time waiting for the controller to decide that devices 0 and 1 are not on the chain!
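The higher-ID-wins rule above can be sketched in a few lines. This is an illustrative Python sketch of the arbitration priority order, not production code; note that on a wide (16-bit) bus the original eight IDs keep their old priority, so IDs 8 through 15 fall in line behind them:

```python
# SCSI arbitration priority: on a narrow (8-bit) bus, ID 7 wins arbitration
# and ID 0 loses, which is why the host adapter usually gets ID 7. On a wide
# (16-bit) bus the order is 7..0 followed by 15..8, for compatibility with
# older 8-bit devices.

def priority_order(wide=False):
    """Return SCSI IDs from highest to lowest arbitration priority."""
    narrow = list(range(7, -1, -1))          # 7, 6, ..., 0
    if not wide:
        return narrow
    return narrow + list(range(15, 7, -1))   # then 15, 14, ..., 8

print(priority_order())   # [7, 6, 5, 4, 3, 2, 1, 0]
print(priority_order(wide=True))
```

So in the example chain, the controller at ID 7 always wins the bus, and the CD-ROM at ID 3 outranks the hard drives at 0 through 2.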
So, you know how to identify a device on the SCSI chain by giving it an address. But what if the device performs several different functions, and you need a way to address each one? That is where the Logical Unit Number (LUN) comes into play. The LUN is a value that is used to identify a logical
unit of a SCSI device. According to the SCSI-2 specifications, there can be up
to eight logical units for each SCSI device address. These logical units are
numbered from 0 to 7. To give an example of how this might be used, think
of a tape drive that has a tape changer. In that case, the entire assembly may
have a SCSI ID of 0. The tape drive may have a LUN of 0 and the changer
may have a LUN of 1. Therefore, the actual SCSI address of the tape drive
would be ID 0, LUN 0.
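The tape-changer example can be sketched as a lookup keyed on (ID, LUN) pairs. The dictionary layout here is purely illustrative, not any real driver's data structure:

```python
# A full SCSI-2 address is the device ID plus a Logical Unit Number (LUN),
# each 0-7 on a narrow bus. Sketch of the text's example: the whole tape
# assembly sits at ID 0, with the drive at LUN 0 and the changer at LUN 1.

devices = {
    (0, 0): "tape drive",
    (0, 1): "tape changer",
    (7, 0): "host adapter",
}

def describe(scsi_id, lun):
    """Return what lives at a given (ID, LUN) address, if anything."""
    return devices.get((scsi_id, lun), "no device")

print(describe(0, 1))   # tape changer
print(describe(3, 0))   # no device
```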
Bus Length
How long can the SCSI chain grow? As I mentioned earlier, it depends on the
level of SCSI you are using. Check out Table 1.3. It should give you a good
idea of the numbers to keep in mind when working with SCSI.
TABLE 1.3

SCSI Type       Bus Speed   Max Bus Width   Max Bus Length   Max Bus Length   Max Bus Length   Max Devices
                (MB/sec)    (bits)          in Meters (SE)   in Meters (LVD)  in Meters (HVD)  Supported
Narrow SCSI-1   5           8               6                N/A              25               8
Narrow Fast     10          8               3                N/A              25               8
Fast Wide       20          16              3                N/A              25               16
Narrow Ultra    20          8               1.5              N/A              25               8
Wide Ultra      40          16              N/A              N/A              25               16
Wide Ultra      40          16              1.5              N/A              25               8
Narrow Ultra 2  40          8               N/A              12               25               8
Wide Ultra 2    80          16              N/A              12               25               16
Ultra160        160         16              N/A              12               N/A              16
Ultra320        320         16              N/A              12               N/A              16
RAID
Another way that SCSI plays an important part for your server is in the
way it can be used to provide redundancy and increased performance. Much
of that is done with a technology called Redundant Array of Independent
Disks (RAID).
In every book I have ever written for Sybex, I have mentioned the gee-whiz factor of computers and of networking. The gee-whiz factor works like this: I understand how it works, I know why it works, I even know how to make it work, but
the fact that it works the way it does still amazes me. Now, I admit, I am easily
amused. But when I think about routing, I am truly amazed. When I think about
the elegant simplicity of Domain Name Service (DNS), I am amazed. But I am
really in awe of RAID technology. The upper levels of RAID are seriously impressive. So, this is what we are going to talk about in this section.
Definition of Terms
I know, I know, there is a whole glossary in the back of the book dedicated
to defining terms. I also know that there are terms that I am going to be
throwing out over the next several pages that we should come to some sort
of a common understanding about. Not that you would fail to take the time
to look in the back of the book to see what they mean; that would never happen.
So, when we start talking about things like RAID, we start using terms
like high availability or fault tolerance. High availability means just what it
sounds like, making sure that the resources your server provides are available a high percentage of the time. Fault tolerance means that if something
breaks, there is something else there to pick up for the broken part, and
things go on as if nothing happened.
Another term we should look at is the phrase single point of failure. A
friend of mine says that you can tell the skill level of a network administrator
by her level of paranoia. The really paranoid ones are the ones who have
been around the block and understand that the question is not if something
is going to go wrong, but when. They also know that no matter how bad they
think it can get, it can get worse. In any computer system, there are components that can go bad. The reliability factor is getting much better, but stuff
still does happen. You are trying to increase your odds, so that when things
do go bad, you are covered. You know that certain components in a system
have a higher chance of failure than others. For example, it is much more
likely that a printer is out of paper or is offline than that the mainboard in
the printer has gone bad. So, by looking at where our single point of failure
is, we are hedging our bets.
Here is a brief example. If I provide a level of disk drive protection called
mirroring, it means that two disks are hooked to a single controller. Everything that is written to one of the disks is written to the other disk. If one of
the disks goes bad, the other disk is there to take over for it, and we have
fault tolerance and higher availability. We have, in effect, moved our single
point of failure from the hard disk back to the disk controller. You can move the single point of failure back farther than that, but that would be stealing my thunder for the section on duplexing.

Copyright 2001 SYBEX, Inc., Alameda, CA
Next, there is the subject of data striping. With data striping, instead of taking bits of information and storing them on a single disk, you are storing them across several disks. In this way, the write heads on several disks are being utilized, and performance increases dramatically. Unfortunately, there is no fault tolerance, so if any one of the disks in the stripe set goes bad, the entire set is dead in the water. This is not necessarily a good thing, so there is something called striping with parity.
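The dealing-out pattern behind striping can be sketched in a few lines of Python. Real controllers stripe fixed-size blocks in hardware, but the round-robin idea is the same; the block names here are made up for illustration:

```python
# Sketch of block-level striping: logical blocks are dealt round-robin
# across the member disks, so a sequential read or write keeps every
# drive's heads busy at once.

def stripe(blocks, disk_count):
    """Distribute a list of blocks across disk_count disks, round-robin."""
    disks = [[] for _ in range(disk_count)]
    for i, block in enumerate(blocks):
        disks[i % disk_count].append(block)
    return disks

print(stripe(["b0", "b1", "b2", "b3", "b4", "b5"], 3))
# [['b0', 'b3'], ['b1', 'b4'], ['b2', 'b5']]
```

Lose any one of those three lists and the file is gone, which is exactly the weakness striping with parity addresses.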
Finally, we get to the subject of parity. What follows is a highly simplistic
explanation of parity, but it should give you an idea of how it works. First,
take a look at Figure 1.6.
FIGURE 1.6 Disk 1: 30GB; Disk 2: 30GB; Disk 3: 30GB; Disk 4: 30GB; Disk 5: Parity storage
Assume that each of the first four disks is 30GB. The fifth 30GB drive is not used to store data; it is just used to store the mathematical sum of the information striped across the first four drives.
So, now we are going to save a file called RESUME.DOC to the striped set of drives with parity. In this case, let's assume that the first block of data can be represented in binary as 1010. That means that a 1 would be written to Drive 1, a 0 written to Drive 2, a 1 written to Drive 3, and a 0 written to Drive 4. Finally, the sum of 1+0+1+0, or 10 (remember, we are dealing in binary here, and in binary 2 is represented as 10), would be written to Drive 5. Parity is defined simply as the quality of sameness or equivalence. With RAID we are using parity not only to check that things are the same, but also to rebuild things that may have been damaged. Take a look at Figure 1.7 to see what I mean.
FIGURE 1.7
Because you have instituted parity, life can go on without anyone being the wiser. If someone wants to access the file RESUME.DOC, the system recalls the file and loads the 1+0+1. It knows there is supposed to be something in place of Disk 4, but since it can't find it, it reads the parity sum of 10 and knows that all that is missing is another 0. The system can continue functioning until you get another drive installed. Hopefully, there will not be any more thunderstorms!
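For the curious, real striping-with-parity implementations use bitwise XOR rather than the arithmetic sum used in the simplified explanation above; XOR has the handy property that any one missing block can be rebuilt from the parity and the survivors. A minimal Python sketch of the rebuild:

```python
# XOR parity: XOR-ing all the data blocks gives the parity block, and
# XOR-ing the parity with the surviving blocks rebuilds any single lost
# block. Here each "block" is one bit, matching the 1010 example.
from functools import reduce

def parity(blocks):
    return reduce(lambda a, b: a ^ b, blocks)

data = [0b1, 0b0, 0b1, 0b0]   # one bit per data drive, as in the text
p = parity(data)               # stored on the parity drive

# Drive 4 (holding the final 0) dies; rebuild its block from the rest:
surviving = data[:3]
rebuilt = parity(surviving + [p])
print(rebuilt)   # 0 -- the lost bit
```

The same XOR works on whole bytes or sectors at a time, which is how a controller rebuilds an entire replacement drive.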
So now that we have terms defined, we can talk about RAID.
RAID 0
This is disk striping without parity. Of all the RAID technologies, RAID 0 is the fastest, because the write heads are constantly in use without any duplicate data being written or any parity being calculated. With this system your server will have multiple disks, and the information is striped across the disks in blocks without parity. There is no fault tolerance in a RAID 0 system.
RAID 1
This level is commonly referred to as disk mirroring, or disk duplexing. In either
case, there are two hard disks involved and anything that is written to one of the
hard disks is written to the other. In the case of disk mirroring, there is just one
disk controller, so the controller is the single point of failure. Disk duplexing
adds a second controller to the second disk, moving the single point of failure
away from the disk subsystem to the mainboard. In RAID 1, if either disk fails,
the other disk takes over. There is no parity or error checking information
stored. If both drives fail, new drives must be installed and data restored from
backup.
The disadvantage of RAID 1 is cost per megabyte. If you have two drives
that have a published capacity of 30GB each, and they are mirrored or
duplexed, the total amount of usable disk space is 30GB, not the 60GB you
purchased. If you are using different-sized drives, the mirror will reflect the
storage capacity of the smallest drive.
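The cost-per-megabyte point reduces to a one-line calculation. A quick Python sketch of the usable capacity of a mirrored pair:

```python
# Usable capacity of a RAID 1 mirror: the pair holds one copy of the data,
# and with mismatched drives the mirror reflects the smaller one.

def raid1_capacity(drive_gb_a, drive_gb_b):
    return min(drive_gb_a, drive_gb_b)

print(raid1_capacity(30, 30))   # 30 -- two 30GB drives yield 30GB usable
print(raid1_capacity(30, 20))   # 20 -- limited by the smaller drive
```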
RAID 5
In this case, data and parity information is striped at the block level across all of the drives in the chain. RAID 5 takes advantage of faster disk reads and writes. The parity information for data on one disk is stored with data on another disk, so if any one of the disks fails, it can be replaced and the data can be rebuilt from the parity data stored on the other drives. RAID 5 requires a minimum of three drives, but usually five or more disks are used. The disadvantage here is that the controllers can become expensive.
Reasons to Use RAID 5
When you start talking about RAID 5 and higher levels, the hardware
controller can become something of an issue. This can cause the price
point of the implementation to climb. Obviously, you are going to use it
on mission critical servers like these:
Database servers
Intranet servers
RAID 0+1
Now we start getting into some of the hybrid approaches. If you look at the name,
and you understand what RAID 0 and what RAID 1 do, you have a pretty good
idea of what RAID 0+1 is. You know that RAID 0 is disk striping without parity.
You know that RAID 1 is mirroring or duplexing of disks. So, RAID 0+1 is where
an entire stripe set without parity is actually mirrored or duplexed. There will be a giant performance improvement on disk reads, though there will be some performance hit on disk writes. Data will survive the loss of multiple disks, but the monetary cost can be high.
Software RAID
RAID 0 and RAID 1 are usually defined at the software level. In this case, it is
the server operating system that determines the RAID level and the level of protection. In Windows NT/2000 it can be called Drive Striping, Drive Striping
with Parity, or Mirroring. In NetWare it may be called Mirroring, but the result is the same: somewhere there is a tool or utility that will allow you to either stripe a drive and provide parity or mirror the drives.
The advantage of using Software RAID is low cost. There are no special
controllers to buy. The operating system will recognize the drives and provide
the level of protection that you define.
Hardware RAID
In some of the more complex implementations of RAID, a special controller or
special disks need to be linked together. When you start mentioning the word
special, the dollar signs usually start to light up. It will be up to the controller
to define the level and type of RAID.
Because you are dealing with hardware rather than software, your
performance will increase.
You have hot swaps, hot plugs, and hot spares. What in the world is the difference, and how do they work? It is all a matter of degree!
Hot Spare
A drive is considered a hot spare if you happen to have an extra drive sitting on
the shelf that matches the type and configuration of the drives in your server. For
example, if you have a SCSI-2 Ultra, 9GB Seagate on the shelf waiting in case of
emergency, that would be considered a hot spare. When the hot spare gets put
into play, it could be a hot plug or a hot swappable drive.
Here is an example of how a hot spare would be used. Say your server has
RAID 1-level mirroring defined. The first drive in the mirrored pair has
failed, for no other reason than drives go bad, and the system has failed over
to the second drive in the mirror. In this case, the system keeps on working
like nothing has happened. You notice the fail over (the fact the first drive
failed and the second took over) and make plans to replace the failed drive
with a hot spare when the server can be taken out of service with a minimum
amount of interruption to the normal workday. When you can down the
server, you shut it off, replace the failed drive with the hot spare, and bring
the server back online. Once the server is online, you can then use the appropriate tool to reestablish the mirror, and the new drive will be mirrored to
match the old.
Hot Plug
With a hot plug drive, the server does not have to be brought down or taken
out of service to install a new drive. In the case of a hot pluggable drive, you
are not replacing a current drive that has failed; you are adding disk space to
the mix. In the case of a hot pluggable drive, you open a cabinet, plug the drive
into the backplane of the cabinet, and the operating system should recognize
the drive is there. Depending on the operating system, you will have to create
a partition and a volume to make the drive available.
Hot Swap
This is one of those gee-whiz things we talked about earlier in the chapter.
I remember the first time I had to hot swap a drive in a RAID array with parity.
When I asked a senior tech how to do it, he smiled and said, "Open, pull, push, watch, and be amazed." I got to the client site with the drive in hand and went
to the server room. There was a large disk array of seven drives in a cabinet
with a glass door, all sorts of flashing lights next to six of the drives, and a
series of steady red lights next to the drive that had died. It didn't take a rocket
scientist to figure out which drive had failed. So I opened the glass door and
saw the two small rocker arms holding the bad drive in place. I moved those
out of the way, grabbed the handle on the front of the bad drive, pulled, and
the drive came out in my hand. I took the new drive, pushed it gently into the
slot until I felt it lock, and then put the rocker arms back in place. Once that
was done, the lights next to the new drive started going crazy, while the drive was
automatically rebuilt from the other drives in the set.
It was seriously cool! No one on the network had any clue that a drive had
ever failed! No data was lost, no time was lost, and the server was never
unavailable.
We probably should have gone over this earlier, but we didn't! As you read through many of these chapters, keep in mind that there are several laws of network computing that come into play. Some of these are documented, some are only figments of my imagination, but they are important to remember just the same. Here are some of my favorites. Williams's Law: You can tell the skill level of a network administrator by his level of paranoia. The really good ones are really paranoid. Murphy's Law: Anything that can go wrong will go wrong, at the worst possible moment. Govanus's Law: The chance of completing any network upgrade successfully is inversely proportional to the visibility of the project and the proximity of your annual review. If you are about to undertake a project that will affect everyone on your network, and it is the night before your annual review, please be sure to be carrying a copy of your resume on a disk in your pocket. You may not make it back to your desk. Finally, my favorite, and this one has proven true worldwide: End users lie. Network administrators are the best end users.
Fault Tolerance
Fault tolerance is the act of protecting your computing gear, whether that
gear be infrastructure-oriented as in switches and routers or computer-oriented
as in servers and disk farms. In either case the fundamental question you ask
yourself is this: How can I protect the equipment so that a fault of some kind
doesn't interrupt service? Impairment of service might be tolerable; interruption
is not.
We talk about uptime of devices and services in terms of 9s. We assume that you want to keep your gear up 99% of the time; that's a given. But as we add 9s to the uptime figure, the time that a device is allowed to be down, including maintenance windows, becomes increasingly smaller. Five nines of uptime equates to an allowance of only about five minutes of downtime per year, including maintenance windows. Your goal with fault tolerance methodologies is to increase the number of 9s in your uptime figure. Five nines is optimal, but not realistic in most situations; four nines is a better goal. We'll talk about how to realize these goals in this chapter section.
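The relationship between nines and allowed downtime is easy to compute; each extra nine cuts the allowance tenfold. A quick Python sketch, assuming a 365.25-day year:

```python
# Allowed downtime per year for a given number of nines of uptime.
# Five nines works out to roughly five minutes per year; 99% (two nines)
# allows nearly 88 hours.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes(nines):
    availability = 1 - 10 ** (-nines)    # e.g. 3 nines -> 0.999
    return MINUTES_PER_YEAR * (1 - availability)

for n in range(2, 6):
    print(f"{n} nines: {downtime_minutes(n):8.1f} minutes/year")
```

Running this makes the budgeting concrete: two nines allows about 5,260 minutes per year, while five nines allows about 5.3 minutes, maintenance windows included.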
Configuring RAID
One thing you can do is use RAID to help augment your system uptime.
Either RAID 1 (disk mirroring) or RAID 5 (disk striping with parity) will be
beneficial to you in terms of bringing fault tolerance to your servers.
Some network operating system software allows you to set up RAID configurations without having to purchase special hardware RAID array controller cards (they're expensive). I have worked with software RAID and don't think it works very well; I much prefer hardware solutions. For starters, the card has its own processor and memory and can go a long way toward offloading work from the central CPU. With software RAID the CPU handles everything. Also, the software solution seems not to be as reliable as the hardware-based solution, though it may have been my fault in configuring the software rather than how the software behaved. Whatever the reason, I'm saying that when you consider RAID implementations, you should pay the extra $2K or so and get the RAID controller card with the system.
Another important point about hardware RAID is that there is always
some data in the card's memory. If you had an ungraceful shutdown on the server while there was some data in that card, it would be lost. Thus it's important to purchase your RAID array controller cards with a battery
backup so that in the event an instantaneous down happens to the server, the
data will be safe for a time until you can bring the server back up. Keep in
mind the data is being held there by a battery, so you don't have days or anything like that, but you do have some cushion you can work with.
You'll usually opt for either a mirroring or a striping-with-parity scenario for a given set of drives. You can have both kinds in your system without encountering any difficulty at all. As a general rule of thumb, you usually want your OS to be on a mirrored set of disks, while your data lives on a RAID 5 volume. Some NOS software won't work on a RAID 5 volume at all.
You'll typically configure the RAID volumes either through a BIOS interface at the card's boot time or through a configuration CD that comes with the server. HP, for example, includes a wizard-like interface that you can use to configure the entire box, including the RAID array. Watch the BIOS messages at boot time and you'll be given the key sequence to enter so that you can access the card's BIOS.
There's also the concept of a RAID 10, where you configure two separate drive cages with RAID 5 arrays and then mirror the arrays. You've got double fault tolerance, because if the first array has two drive failures, you can break the mirror and work on the second drive array until you get the first one fixed.
Realize that just because the system is on RAID doesn't necessarily mean it'll never have to be taken down. RAID helps safeguard systems so that they can keep working until users go home and you have a chance to down the computer and make repairs after hours. You want to avoid downing servers during working hours.
want to configure it. Keep this in mind as you add disks to the system. You
cannot add a hard drive that is smaller than the current array is expecting
and be able to configure a volume. You must provide as large a disk or larger
in order to facilitate the addition. If you have some left over, it's up to you
to configure the extra space as you see fit.
Lastly, remember the n-1 rule with RAID 5. You take the number of disks you're going to dedicate to the RAID 5 array and subtract 1 from that number to account for the space needed for the parity stripe. Thus if you have six 17GB hard drives you're putting in an array, you'll really only wind up with 5 x 17GB worth of data, because you sacrifice one disk's worth of space for the parity stripe. Actually, the stripe is usually spread across all the disks, so you're not really dedicating one disk to the parity stripe, though there are RAID implementations that will allow you to do such a thing. With RAID 5, then, more disks means that you attain more actual disk storage space and don't sacrifice as much space to parity striping. More is more in the case of RAID 5.
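The n-1 rule reduces to a one-line formula. A quick Python sketch using the six-drive example from the text:

```python
# The n-1 rule: a RAID 5 array of n equal-size disks yields (n-1) disks'
# worth of usable space; one disk's worth goes to the (usually distributed)
# parity stripe.

def raid5_usable_gb(disk_count, disk_gb):
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_gb

print(raid5_usable_gb(6, 17))   # 85 -- the six 17GB drives from the text
```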
Planning for fault tolerance is all about redundancy. You should consider
applying redundancy in any of your mission-critical servers. A little bit of
money spent now can save countless hours of downtime later on.
There's one other technique that's often used: a hot spare. In a hot spare situation, you keep a spare drive in the computer's drive bay. You configure the RAID array controller to treat the drive as a spare. When data is written to the stripe, the drive is included as a backup drive. If one of the main drives in the array fails, you can utilize the hot spare to act as a fallback. Hot spares are handy because you simply have to go into the RAID configuration utility and tell it to begin using the hot spare. The downside is that you burn a hard drive you wouldn't ordinarily have to use.
All of the above systems provide high-availability scenarios in the case of
a single drive failure. Keep in mind that two or more drives failing means the
end of one array.
One thing that should be obvious from looking at the above list is that you
cannot make DR decisions alone. Clearly you'll need to solicit the advice and
interaction of others in order to facilitate a robust DR plan.
Remember the basic concepts of DR: fault tolerance, the ability to gracefully recover from a fault, and redundancy. If, for example, your business
runs entirely off of Web activity, then your Web servers are of paramount
importance to you. So much so that you cannot afford for them to go down.
In such a case, a DR plan might include the following components:
Clustered computers that can allow for the failure of any one server
Hopefully you get the idea. DR means that you provide an offsite place
where a redundant copy of your operation can live in case the first instance of
your operation somehow gets annihilated. Redundancy means that you build
fault tolerance into the feature set so that you avoid annoying little failures that
have the capability of driving the entire enterprise to its knees. You put this all
down in writing in a DR plan and then you periodically test the plan to make
sure it works for today's operations.
Summary
So, what we have done here is give you some protection against Murphy's Law, and a way to prove that you match Williams's definition of a really good network administrator. There is more to be done on this front, but that about takes care of the disk subsystem. First we are going to do some review, and then in Chapter 2, "IDE Devices," we will be looking at clustering, Fibre Channel, CPUs, and multiprocessing.
We also talked about fault tolerance and all of its nuances. There is one basic notion that comes into play when we think about fault tolerance: redundancy. The acronym RAID, for example, stands for Redundant Array of Inexpensive (or Independent) Disks. You use redundancy to build in high availability. A hot spare drive is one that sits in the drive cage and can be put into play by tweaking the RAID utility. Hot swap capability means you can change out a hard drive without any interruption to users. Warm swap means you can change out the drive in an array, but you have to disrupt I/O requests long enough to get the drive replaced. You avoid the
Exam Essentials
Know the difference between logical drive and physical drive A physical drive can contain multiple logical drives, but a logical drive will usually reside on one physical drive.
Know the different levels of RAID, and what makes each level unique
RAID 0 is disk striping without parity; RAID 1 is disk mirroring or duplexing; RAID 5 has data and parity information striped at the block level across the drives; and RAID 0+1 is where a disk array that has been striped without parity is also duplexed or mirrored.
Know which levels of SCSI can interoperate without an adapter and which
levels will require an adapter. SCSI, SCSI-2, and Ultra SCSI all use a
50-pin connector that is interchangeable. Wide SCSI, Wide Ultra SCSI,
Ultra 2, and Ultra 160 use a 68-pin connector.
Know the appropriate lengths of the various SCSI cables. SCSI is 6
meters, Fast SCSI is 3 meters, and Ultra SCSI is 1.5 meters with more than
five devices. If there are fewer than five devices, then the cable can also be
3 meters in length.
Be comfortable with the differences between hot plug, hot spare, and hot
swap. A hot spare is a device that is waiting to be put into the machine.
The other two choices are very close in meaning: A hot pluggable device
is one that can be installed while the computer or server is turned on, and
a hot swappable device is one where the device can be removed and
replaced and the server will experience no loss of service. For example, a
single network card can be hot pluggable. Drives in a RAID 5 array can
be hot swappable. If one of the drives fails, it can be removed and
replaced, and the data can be rebuilt on the fly without any loss of service.
Know how to configure drives. Be able to add or change drives in an
array and configure accordingly.
Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Active
American National Standards Institute (ANSI)
asynchronous mode
data striping
disk duplexing
Domain Name Service (DNS)
Fast SCSI
Fast-Wide SCSI
fault tolerance
Force Perfect Termination (FPT)
high availability
High Voltage Differential (HVD)
hot spare
jumpers
Logical Drive
Low Voltage Differential (LVD)
LVD/MSE
Mega Transfer (MT)
mirroring
parity
Passive
Physical Drive
RAID 0
RAID 0+1
RAID 1
RAID 5
Review Questions
1. What is the width of the data transfer bus of SCSI-1?
A. A nibble
B. 4 bytes
C. 8 bytes
D. 8 bits
E. 16 bits
2. SCSI-1 is also referred to as which of the following?
A. Narrow
B. Slow
C. Fast
D. Wide
E. Ultra
F. Ultra 2
G. Narrow, Fast, and Wide
3. Choose one type of connector used in SCSI-1.
A. 9-pin serial
B. 15-pin serial
C. 25-pin Centronics
D. 50-pin Centronics
4. RAID stands for which of the following:
A. Redoubtful Array of Inexpensive Diskettes
B. Redundant Array of Inexpensive Disks
C. Redundant Array of Independent Disks
D. A SWAT team action
hard drive in one of the Web servers fails. The server is running hardware
SCSI-based RAID. What kind of drive changeout can Horace most likely
perform?
A. Cold swap
B. Warm swap
C. Hot swap
D. Hot spare
7. Ultra and Ultra 2 are examples of which of the following:
A. RAID 10
B. SCSI Bus Width
C. Physical Drives
D. SCSI Bus Speed
8. With LVD SCSI, how many wires will be dedicated to carrying the signal
your server. You still need to keep a device that works with normal
SCSI. Is it possible to run the SCSI device from the SCSI-2 controller?
A. No, SCSI-2 is not backwardly compatible to SCSI.
B. No, SCSI is Single Ended, and all SCSI-2 is HVD.
C. Yes, SCSI-2 is backwardly compatible with SCSI.
D. Check proper termination.
11. What is the maximum number of devices that can be part of a SCSI-3 bus?
A. 16 devices
B. 8 devices
C. 7 devices
D. 14 devices
E. 15 devices
12. What is another name for SCSI Ultra?
A. SCSI Wide
B. SCSI Fast and Wide
C. SCSI 20
D. SCSI 40
E. SCSI Fast 20
wires to carry it. One wire will carry the signal and the other wire
will carry a defining voltage.
C. For any signal that is going to be sent across the bus, there are two
wires to carry it. One wire will carry the signal and the other will
be ground.
D. For any signal that is going to be sent across the bus, there are two
17. You have a SCSI controller in your server and now you wish to add
with mirroring?
A. RAID 1+5
B. Hybrid RAID 5+
C. High Performance RAID
D. RAID 0+1
19. Which of the following SCSI standards can use a 68-pin connector and
20. Your boss has asked you to implement Hardware Level RAID because
performance.
B. Software Level RAID is less expensive but provides better performance
swap. If Horace has an extra drive sitting around (one that has the slide
rails used for his computer's drive cage), all he has to do is pop the old
drive out, put the new one in, and the RAID controller should automatically take over. Some older controllers require you to manually begin
the array rebuild.
7. D. Ultra and Ultra 2 are examples of Bus Speed.
8. C. With Low Voltage Differential, any signal that is going to be sent
across the bus has two wires to carry it. One wire will carry the signal;
the other wire will be attached to ground.
14. D. With RAID 3, the data is striped in bytes.
15. E. RAID 5 data is striped at block level across all of the drives in the
chain.
16. A. Whether a drive is hot swappable has nothing to do with its status
as a hot spare.
17. B. SCSI-2 was backward compatible with SCSI, but for maximum
benefit, it was suggested that you stick with one technology or the
other, preferably using a SCSI-2 controller with SCSI-2 devices.
SCSI, SCSI-2, and Ultra SCSI all use a 50-pin connector that is
interchangeable.
18. D. RAID 0+1 is a hybrid approach where an entire stripe set without
parity is also duplexed or mirrored.
68-pin connector.
20. A, C. Hardware RAID costs more because of the special controller
Chapter 2
IDE Devices
SERVER+ EXAM OBJECTIVES COVERED IN
THIS CHAPTER:
3.3 Add hard drives.
It wasn't that long ago that SCSI drives were selling for about
$1,000 a gigabyte and memory was selling for $100 a megabyte. Maybe it
wasn't that long ago by the calendar, but in computer terms, it was eons
ago. In the late 1980s, Integrated Drive Electronics (IDE) drives were introduced as a lower-price-point alternative to SCSI drives, or some of the
other high-priced, low-performance alternatives. Since the '80s, IDE drives
have come a long way, to the point where they are being shipped in an estimated 90% of all systems sold. Let's take an in-depth look at the history
of IDE drives, how they have overcome some of the barrier limitations
imposed by the technology, and where IDE technology is today. We can
then take a look at the types of cabling and connectors required to install
IDE devices, and some of the differences between IDE and SCSI.
For complete coverage of objective 3.3, please also see Chapter 1. For complete
coverage of objective 3.6, please also see Chapters 1 and 10.
As you remember from taking the A+ exam, disk subsystems are made
up of the hard disk, the cabling, and the disk controller. Last chapter, in our
discussion of SCSI, you saw how the controller had to be matched to the type
of SCSI technology, and how the controller played an active part in moving
data and instructions. Disk controllers can be integrated into the mainboard,
or they can be on a board that plugs directly into the mainboard. Sometimes
these are called controllers, but you may also see the terms paddle cards or
even paddleboards.
Like many of the computer industry acronyms, IDE has picked up several
definitions. Depending on the book you read, it may be Integrated Device
Electronics, or it may be referred to as Integrated Drive Electronics. This falls
under the tomato (toe-may-toe)/tomato (toe-mah-toe) argument; it really
doesn't matter much where the acronym came from as long as you know
what it is referring to and the darn things work.
Since the drive was controlled by electronics on the drive, the drive
manufacturers could encourage enhancements, because there were no
pesky controller compatibility issues to contend with. Each manufacturer
was free to include some new techniques that would increase capacity,
speed, and the average time that the drive could operate without failure,
called the Mean Time Between Failure (MTBF). Some of these advances
included error checking, or the ability to automatically move contents
from blocks that were failing to blocks that were specifically set aside for
the purpose, or generating higher disk rotation speeds to ensure faster
data access, and even giving the user the opportunity to re-map the drive
geometry if desired. Let's take a look at the history of IDE to track where
it has been until today.
IDE hard drives, but those are by no means the only devices that take advantage of IDE technology. There are also things like IDE tape devices and IDE CD-ROMs.
As we mentioned above, IDE was originally designed so the disk controller was integrated into the drive itself. This meant that the drive no longer
had to rely on a stand-alone controller board for instructions as all of the
other types of drives did. This integration brought the cost down. It also
made the drive's firmware implementations easier to manage for the manufacturer. This meant you had a device that didn't cost very much, and was
exceptionally easy to install. People loved it and the boom of the disk drive
industry was on.
ATA History
When ATA was introduced in the late '80s, it was a hard-drive-only type of
technology. At the time the ATA standard was approved, applications and
operating systems came on diskettes and only the real computer aficionado
had a CD-ROM device. Most CD-ROMs at the time were SCSI based and
expensive. Since there weren't many, if any, things being distributed on CDs,
the CD-ROM was not a necessity for most folks.
As applications and operating systems grew, diskette distribution
became unwieldy, not to mention expensive. There had to be a better
way, and that better way was to distribute software on CDs. After all,
CDs were very inexpensive and could hold almost 700MB of data. CDs
were relatively impervious to end users. For an end user to do something
to damage a CD, they had to work pretty hard.
Now it became imperative that a reliable, low cost method be made available to distribute CD-ROM drives to the masses. The designers of the ATA
specifications suddenly needed to come up with a way to attach things like the
CD and the various tape drives or other storage devices on the existing disk
subsystem. Using the same ATA controller card to manage two devices would
be infinitely more viable than having to put yet another controller card in an
already crowded computer bus. So, the designers came up with something
called the ATA Packet Interface (ATAPI). ATAPI is a fancy name for an extension of the ATA interface. The extension is designed to allow several other
types of devices to plug into an everyday, ordinary old ATA 40-pin cable.
There are some differences in the way ATA supports hard drives and the way
it supports other devices. The hard drives receive support through the system
BIOS. It is up to the BIOS to define the geometry of the drive. These other devices
required a special device driver to support them. So, for example, if you had
installed an early version of the SuperWhizBang 8 X CD-ROM, you would originally need a driver from SuperWhizBang so the system would recognize the fact
that the drive was there. Back in the old DOS days, this required editing the
AUTOEXEC.BAT and CONFIG.SYS files to make sure everything worked just the
way it was supposed to. Depending on the operating system you are using, there
may still need to be some manual configuration of devices.
The standards continued to mature, and CD-ROM manufacturers started
working together to provide support for ATAPI. As ATAPI drives became
more standardized, operating systems, and in many cases the BIOS, were
able to recognize the CD-ROM. If the O/S or the BIOS could recognize the
drive, it could immediately load the driver, and if the BIOS can recognize the
CD-ROM, the CD can even be used as a bootable device. This eventually led
to some new advances that we take for granted today, with things like CD-ROMs that will autorun programs to start installations.
Back to the good ol' days. When CD-ROMs became viable, it brought up
another shortcoming of the early ATA standard: That was the number of
devices you could have in an ATA chain. With the early drives, you could have
a maximum of two drives connected to a paddleboard and there could only be
one paddleboard in the computer. As you will see, the later implementations
of the standard increased the number of ATA channels in any machine to two,
so you can now have up to four ATA devices in a system. We will discuss how
to configure those four devices a little later in the chapter, in the section on
master/slave/cable select.
TABLE 2.1  PIO Modes and Transfer Rates

Mode         Rate
PIO Mode 0   3.3 MBytes/sec
PIO Mode 1   5.2 MBytes/sec
PIO Mode 2   8.3 MBytes/sec
PIO Mode 3   11.1 MBytes/sec
PIO Mode 4   16.6 MBytes/sec
As the need for speed increased, the PIO standard couldn't keep pace. That
was when DMA came into being. Instead of the device sending information
through the processor, now the information was written directly to memory.
Because the information is written directly to memory, the Central Processing
Unit (CPU) doesn't have to do anything with it, so the overall performance of
the computer is increased. DMA and Ultra DMA can increase processing
speeds to 100 MBytes/second, but we are getting ahead of ourselves.
Back to ATA 2. In addition to the different methods of handling data,
there were many other under-the-hood kinds of things that the average user
probably wouldn't be aware of. These included things like some powerful
drive commands, like the Identify Drive command. This command was a
godsend to technicians everywhere. Prior to the standardization of the
Identify Drive command, the technician who installed the drive had to
know some exact information on the way the drive was configured. That
usually wasn't a problem if it was the original installation of the drive and
you had all the documentation right there, but if the drive were ever
moved, or pulled out of one machine to be used in another, the configuration information tended to get lost. (Not that something like that would
ever happen to me. Nope, never happen, because I always write the drive
specifications on the outside of the drive with a permanent marker. And if
you believe that, let me know, I have a great bridge to sell you just outside
of McCausland, Iowa.) Then you had to search for the documentation in
your exceptional filing system or call the manufacturer.
Anyway, that problem went away with the updated drives and updated
BIOS. If the drive was an ATA 2 device, when the drive was installed in a
computer, you simply had to install the drive and turn the computer on.
The BIOS would go out and discover the drive automatically. The drive
tells the BIOS how it is built and then the BIOS makes sure the rest of the
computer knows how to address the drive and how much viable space there
is. It is a wonderful thing. It is really one of the first instances of Plug and
Play, only this installation happened well before the operating system even
started to load.
Another advance was the way the drives handled the data transfer. Instead
of moving the information bit by bit, or even byte by byte, ATA 2 began to
allow block data transfers, called block transfer mode. Think of it this way:
Imagine you have just gotten back from the grocery store after buying one
month's worth of groceries for a family of four. Further imagine that all you
could carry into the house was one item at a time. That would take you a really
long time to get everything into the house. That is the way it was before block
transfer mode came into play. Now, with block transfer mode, compare how
much more efficient it is to carry the groceries in one or two sacks at a time.
It may still take you a while to move all the stuff into the house, but not as long
as the other way. Block transfer mode just moved more information in a single
operation.
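The grocery analogy can be put into a few lines of Python; the specific numbers (a 512-byte block, 4KB of data) are illustrative only, but they show why block transfer mode cuts the number of transfer operations so dramatically:

```python
# Compare the number of transfer operations ("trips to the car")
# needed to move 4KB one byte at a time versus one block at a time.
data = bytes(4096)   # pretend this is data waiting on the drive

# Before block transfer mode: one operation per byte
trips_single = len(data)

# Block transfer mode: one operation per 512-byte block
BLOCK = 512
trips_block = (len(data) + BLOCK - 1) // BLOCK  # round up to whole blocks

print(trips_single, "operations vs.", trips_block)
```

Either way all 4KB gets moved; the win is per-operation overhead, paid 4,096 times in the first case and only 8 times in the second.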
These block transfers were made possible by a new way of defining and
addressing the sectors on the hard drive. This was done using a process called
Logical Block Addressing (LBA). LBA had an additional benefit, because it
managed to overcome the early IDE size limit of 528MB.
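The 528MB figure falls out of simple arithmetic: the BIOS and early IDE addressing schemes, taken together, allowed at most 1,024 cylinders, 16 heads, and 63 sectors per track of 512 bytes each. A short sketch of both the limit and the standard CHS-to-LBA conversion formula (shown here for illustration):

```python
# The combined BIOS/IDE geometry limit that LBA did away with
CYLINDERS, HEADS, SECTORS, SECTOR_SIZE = 1024, 16, 63, 512
limit = CYLINDERS * HEADS * SECTORS * SECTOR_SIZE
assert limit == 528_482_304          # about 528MB

# LBA simply numbers sectors 0..N-1; converting a CHS triple:
def chs_to_lba(c, h, s, heads=HEADS, sectors=SECTORS):
    return (c * heads + h) * sectors + (s - 1)   # sector numbers start at 1

assert chs_to_lba(0, 0, 1) == 0      # the very first sector on the disk
assert chs_to_lba(0, 1, 1) == 63     # first sector under the next head
```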
ATA 2 maintained its backward compatibility with ATA drives. It used
the same 40-pin physical connector used by ATA, and an ATA 2 drive could
be used in conjunction with an ATA drive.
There are some other ways that ATA 2 may be described. For example,
you will hear terms like Enhanced IDE (EIDE) or Fast-ATA. Neither of these
is a standard; each is just a different implementation of the ATA 2 standard.
EIDE, which started out as a particular manufacturer's implementation, has
become so popular that EIDE has become more or less a generic term.
ATA 2 also introduced the capability of having two channels of two
devices per paddleboard. This meant that the total number of IDE devices
that were possible in a system had climbed to four. The channels were
referred to as the primary channel and the secondary channel.
ATA 3
The next standard is ATA 3. ATA 3 does not do anything for the faster transfer modes, but it does provide for password-based security and better power
management. It also has a technology called Self-Monitoring Analysis and
Report Technology (SMART). SMART will tell you when a drive is going
bad before it exhibits any symptoms that you may be aware of.
If you sometimes wonder why your computer takes a long time to respond
after you have let it sit for a while, ATA 2 may be part of the reason. You see,
it also added some sophisticated power management features that would put
the drive to sleep after it hadn't had anything to do for a while. ATA 3 is also
backwardly compatible with ATA 2, ATAPI, and ATA devices. You may also
see the term EIDE applied to ATA 3 devices, since there has been no significant
improvement in data transfer.
The system has excessive signal noise caused by multiple drives, a dual
power supply, or even an integrated Cathode Ray Tube (CRT).
The system has been put in overclocking mode, or has been set beyond
the manufacturers specifications.
ATA 66
Well, if ATA 33 moved data at 33 MBytes/second DMA, you will never guess
what rate ATA 66 moves data. You got it. It uses even faster high-performance
bus mastering for a 66 MBytes/second DMA data transfer rate. This can also
be called Ultra DMA-66 or just UDMA-66.
If you are going to install an ATA 66 drive, you will need the appropriate
drive, controller, and BIOS. Again, it is fully backwardly compatible with the
previous ATA standards, but the cabling has changed. The change was necessary because the transfer rates became so high that there needed to be more
protection against things like crosstalk and electromagnetic interference
Make sure you have the right cable. You can tell you are using a
40-pin/80-conductor cable because it will have a black connector
on one end and a blue connector on the other end, with a gray
connector in the middle. The blue connector goes to the motherboard, the gray connector is for the slave device, and the black
connector is for the master drive. In addition, the cable has something you probably won't be able to see: Pin 34 should be notched
or cut. The reason will become plain in the next bullet.
The motherboard or mainboard controller must be capable of supporting the ATA 66 standard. A compatible controller has a detect circuit that can recognize the fact that line 34 is not present on the cable.
If the detect circuit is missing, the motherboard may be able to detect
the presence of an ATA 66 cable, but may try to configure the device
for a higher transfer rate.
Some controllers may not be able to handle the ATA 66 on both the
primary and secondary channels. If you are having problems installing
the device on the secondary controller channel, you may want to move
it to the primary channel and see if that solves the problem.
Make sure you have the right controller card driver. Make sure the
BIOS is upgraded, and any patches that need to be applied to the
motherboard have been taken care of.
Be sure you are using a DMA-capable operating system and that the
DMA mode has been activated.
Make sure the drive has been configured to run at ATA 66 transfer
rates. Some drives ship with the higher transfer rate disabled by
default; enabling the higher transfer rate is done with either a jumper
switch or with a software setting.
ATA 100
The most recent advance in the world of ATA/IDE is the release of the ATA
100 interface. As you can tell from the name, the ATA 100 specifications
allow for the transfer of data at a rate of 100 MBytes/second. This is the
transfer from the host-to-drive bus. The new interface does maintain some of
its history, using the same 40-pin, 80-conductor cable as the ATA 66. This
means that like all the other devices we have talked about so far, the ATA
100 cable can be used with other, slower drives. These can include things like
hard disks, removable media disks, CD-ROM drives, CD-R/RW drives,
ATA tape drives, and DVD-ROM drives.
There are other advances made with ATA 100. One is something that has
been around the computer world for a while: Cyclic Redundancy Check
(CRC). The CRC is a very-high-level method of checking to make sure the
transferred data actually made it through the transfer process without
becoming corrupted. It is just a data reliability check.
It works like this. When the device that is transferring the data gets ready to
send it, it attaches an extra set of bits to every frame of data. These extra bits are
called the Frame Check Sequence (FCS), which acts as a type of verification that
is attached to each frame. When the frame is received, the receiver does the math
and checks to make sure the answer is what it expects. If it does, all is good. If
it doesnt, the frame has been corrupted and it needs to be retransmitted.
Lets look at a really simple example. Remember when you were kids and
had those really cheesy secret decoder rings that came in cereal boxes? That
way you could send messages to your friends, and if the teacher intercepted
them, she couldn't read them out loud. Well, the basis of that was usually
some kind of mathematical formula. We will assume that the sender is going
to multiply everything by 3 and that the receiver knows that. So, we take a
look at a simple four-bit frame:
1101
Now, since we are working with a frame, that is a binary number, not
a decimal number, so 1101 translated from binary to decimal is 13. Since
we agreed we are going to multiply everything by 3, our 13 becomes 39.
Converting that to binary, we have this result:
100111
Now, we are going to make another assumption, and we are going to
assume that our packet is made up of two parts; the first contains the answer,
and the second part contains a sequence of three sets of 10 and then the data.
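The decoder-ring scheme can be sketched directly. The multiply-by-3 rule stands in for the polynomial division a real CRC uses, but the verify-by-recomputing idea is the same; the values below are the ones from the text:

```python
# Sender and receiver agree on the rule: multiply the frame value by 3.
frame = 0b1101                      # the four-bit frame from the text (13)
check = frame * 3                   # 13 * 3 = 39
assert bin(check) == "0b100111"     # the binary result shown above

# Receiver side: redo the math and compare against the transmitted check
received_frame, received_check = 0b1101, 0b100111
assert received_frame * 3 == received_check   # frame arrived intact

# A corrupted frame fails the check and would be retransmitted
corrupted = 0b1001
assert corrupted * 3 != received_check
```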
This group is shooting for a new interface that will increase throughput to at
least 160 MBytes/second, with later versions reaching 528 MBytes/second. In
order to do this, the cable design is going to have to be radically altered. Instead
of the current 40/80 cable that allows for only four attachments, the new cable
will be much smaller, with only four signal pins and a few more pins for power
and electrical ground. What is this going to do to the current technology?
According to the Frequently Asked Questions (FAQ) at the Serial ATA
Working Group's Web site (www.serialata.org), the new implementation is
going to be designed so that it will drop into a PC and be compatible with the
software, meaning it will run without modification to your current computer
(other than the appropriate controller and devices). Since the cables will be
smaller, they will be easier to route and easier to install.
What about all your old stuff? It is anticipated that there will be a period
where both the old parallel standard and the new serial standard are available.
Now, this could cause a problem. Since both types of devices are going to show
up in the same machine, and since each will have its own interface, the Serial
ATA group expects that there are going to be some adapters to adapt the serial
cable to be able to handle the old 40/80 devices.
Serial ATA is going to support all the normal ATA and ATAPI devices,
including CDs, DVDs, tape devices, high capacity removable devices, and
Zip drives. One of the other goals is to make the devices easier to upgrade,
because the Serial ATA group is planning on eliminating jumper settings for
defining the device's role.
For more information on jumper settings and drive roles, see Master/Slave/
Cable Select and Jumper Settings later in this chapter.
Now if you look closely, you can notice several things. The first is the
thickness of the conductor channel. With the 40-pin connector, the channel
is much thicker than it is with the 40/80 connector.
FIGURE 2.1  The ATA/33 cable (40 conductors) and the ATA/66 cable (80 conductors)
The next thing I would like you to notice is that there is a dark line down
the right side of each cable. This line indicates the location of Pin 1. That will
become important in just a few seconds. Each of these cables, although you
cannot see it, has two other similar connectors on it. One of those connectors
would attach to the controller and the second connector would attach to
another ATA device.
You may be asking yourself why there is only one extra connector. Remember
that with IDE, unlike SCSI, there can only be two devices in a chain; SCSI can
have seven. Depending on the IDE controller, there can be up to two IDE chains
in any computer, for a total of four devices. Also, SCSI can handle external
devices, while ATA cannot.
Let's talk about installation. First of all, take all the usual precautions.
Turn the computer off, and unplug it. Always work with an antistatic mat
and an antistatic wrist strap. The antistatic mat is made of a conductive
material that is set on the top of your worktable, and then the computer or
other component is set on the mat. When the antistatic wrist strap is fastened
to the mat, the electrostatic charge level of anything placed on the mat will
become equalized with the charge level of the mat, and these will become
equalized with the charge level of your body. After the charges have been
equalized, electrostatic discharge (ESD) sparks will not occur. Now, let's
assume that you are installing a device that has the controller for both channels built right into the motherboard. We will also assume that you have
already mounted the device in the case. The first thing you have to do is
attach the cabling to the motherboard. Remember that colored stripe: this
is where it comes into play. Since we are going to be adding another device
to the system (yes, this is another assumption), you locate the connector for
the second IDE channel. It should be marked on the motherboard with something really creative like IDE-2. Then, looking very carefully at the motherboard, you will see a small 1 near the end of one of the connectors. That
shows you where Pin 1 is on the motherboard. Now, Pin 1 on the motherboard has to match Pin 1 on the cable or things just will not work. Once you
have located Pin 1 on the motherboard, carefully line up the holes on the
connector with the pins on the motherboard, keeping the 1s together. Push
down gently until the cable is snug to the motherboard.
Here are a couple of tips. First of all, be careful to make sure that the pins are
all lined up with the holes on the connector before pressing down too hard. If
you should happen to bend or break one of the pins, it will probably ruin your
day. That is especially true if the controller is embedded in the motherboard.
That would mean replacing the motherboard, usually an expensive proposition. Secondly, once the cable is firmly attached to the mainboard, take a permanent marker and mark the channel in big bold numbers, so the next time
you have to add something to the IDE chain, you can immediately know which
channel you are dealing with. The channel information is silkscreened on the
motherboard, but I usually need a flashlight and a magnifying glass to read it.
This way is just simpler.
Once the cable is attached to the motherboard, you can attach the cable
to the device you are going to install. Again, check to find Pin 1. If you can't
find Pin 1, look closely at the male connectors on the drive. There will usually be either a space without a pin, or there will be a notch in the plastic connector sleeve. Check the end of the cable, and you may see one of the
pinholes blocked, and you may also see a notch on the cable. Line those up,
plug the cable in, and seat it firmly. Plug the power from the power supply
into the device and you should be ready to power up the computer. The
ATA-66 cable is keyed. Remember, the blue keyed end attaches to the
motherboard. If a standard ATA cable is installed in reverse (Pin 1 to Pin
40), the hard drive LED will stay on continuously.
Copyright 2001 SYBEX, Inc., Alameda, CA
How do you know if it is there? Well, depending on the computer you are
using, watch what happens when the system boots. Some BIOS implementations will show you the devices they find as they go through Power On Self
Test (POST). Otherwise, you may have to access the BIOS to see if the device
has been recognized, or, depending on the operating system, the new device
may be visible through something like Windows Explorer.
All through this chapter, we have been mentioning the fact that there
can be two devices, and only two devices, in an ATA subsystem. When you
start examining the advantages and disadvantages of IDE versus SCSI, that
is just one of the areas where IDE falls short. The other area is in the way the
two devices are linked together.
As you know by now, IDE stands for Integrated Drive Electronics. All of the
drive's intelligence is on board every single drive. That is a great thing if you
have only one drive on the subsystem, but when there are two drives hooked
together and both want to be the brains of the operation, things don't work well.
With IDE devices, you have to relegate one of the drives from being the brains
of the operation to being the go-fer. This is called designating one of the drives
to be the master, and the other to be the slave.
So, there are two ways a single channel of IDE components can be strung
together. Take a look at Figure 2.2.
FIGURE 2.2
As you can see, in the top part of the diagram there is a single drive
attached to the IDE host adapter. In the bottom part of the drawing, there
are two devices taking orders from the same host adapter.
Defining the master and the slave is done with jumpers. Now, there are
three possible settings: master, slave, and cable select.
Jumper Settings
Figure 2.3 shows what the business end of an ATA device looks like. If you
look closely at the picture, you will see that there are three sets of pins circled.
FIGURE 2.3
You will also notice that there is a small piece of plastic covering two of the
pins. This very small but very powerful tool is called a jumper. You see, each
set of pins represents a channel that the information signal can take from the
controller to the electronics on the drive. The presence or absence of the
jumper completes a circuit that defines the path the electrical impulses will
take. For example, if there were no jumpers present, the information would
follow the path so the drive would be configured as a master, with no slave
device present. Look closely at Figure 2.4 to see the different types of settings.
FIGURE 2.4
In Figure 2.4, the master/slave selection switch is designated as J8. If there
were no jumpers present, the drive would be configured as a master, with no
slave device present. Having a jumper covering Pins 3 and 4 of Switch J8 may
designate the drive as being the master with a slave present. The other drive in
the chain would then have to have a jumper covering Pins 5 and 6, indicating
that it would be the slave device, taking all of its instructions from the master.
Now it would be a wonderful thing if I could tell you that each set of pins
for the master/slave relationship was labeled J8, and that in each and every
case, no jumpers indicated master, jumpers across 1 and 2 indicated master
with slave present, and jumpers across 3 and 4 indicated slave with master
present. It would be a wonderful thing, but it would not be the real world.
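To make the mapping concrete, here is a tiny Python sketch of a jumper lookup table. The pin pairs and role names mirror the hypothetical J8 example above; real drives label their pins differently, so treat the table itself as an illustration, not vendor data.

```python
# Hypothetical jumper-position table for one ATA drive, following the
# J8 example in the text. Real pin assignments vary by manufacturer;
# always check the drive's documentation.
JUMPER_ROLES = {
    frozenset(): "master, no slave present",
    frozenset({(3, 4)}): "master, slave present",
    frozenset({(5, 6)}): "slave",
}

def drive_role(jumpered_pins):
    """Return the role implied by the set of jumpered pin pairs."""
    return JUMPER_ROLES.get(frozenset(jumpered_pins),
                            "unknown: check the documentation")

print(drive_role([]))        # no jumpers: master, no slave present
print(drive_role([(5, 6)]))  # pins 5-6 jumpered: slave
```

The point of the table is exactly the caveat in the text: the mapping is per-drive data, not a universal rule, which is why the lookup falls back to "check the documentation" for anything it does not recognize.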
Now, while it is generally true that the absence of a jumper usually indicates
a master with no slave present, in the real world, things just may not be what
they seem. When you are configuring any ATA devices, be sure to check the
appropriate documentation. If you can't find it, check the Web. Be sure to
check the documentation before you start removing jumpers and putting
jumpers back on again. Trust me, it will make your whole life a lot easier.
If you have installed multiple ATA devices, and one of them is recognized by the
system and the other isn't, or if neither of them is recognized by the system, shut
the machine off and start over. Your jumpers are in the wrong place. If you get
things really flummoxed, you may want to go back to the beginning; in other
words, install the first device as a master with no slave. Check to make sure it is
recognized. Remove the first device, and install the second device as a master
with no slave; check to make sure that it is recognized. Once that has been done,
you know both your devices are good. Then configure one as the master and the
other as the slave and install them. Check to make sure they are both recognized.
If not, check to make sure the cable is tight. If the cable is tight and one (or both)
of the devices is still not being recognized, and you are absolutely, positively
certain the jumpers are 100% correct, replace the cable. Make sure you replace the
cable with the right type of cable for the most advanced type of ATA device in the
chain. In other words, if you have an ATA 66 device in the chain, you should be
using a 40/80 cable. In this scenario, the potential problem areas are the jumper
settings and the cable. As a last resort, replace the jumpers. Sometimes they lose
the metal sleeve that covers one of the pins, and contact is not made.
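The isolation procedure above can be sketched as a small decision function. This is only an illustration of the order of checks described in the text, not any vendor's diagnostic logic, and the parameter names are invented for the example.

```python
def diagnose_ata_chain(recognized_alone, cable_tight, cable_matches_fastest):
    """Sketch of the isolation steps from the text.

    recognized_alone: dict mapping each drive's name to whether it was
    recognized when installed by itself as master with no slave.
    cable_tight: whether the cable is firmly seated.
    cable_matches_fastest: whether the cable is rated for the fastest
    ATA device in the chain (e.g. a 40/80 cable for ATA 66).
    """
    # Step 1: test each drive alone as master with no slave.
    bad = [name for name, ok in recognized_alone.items() if not ok]
    if bad:
        return "suspect drive(s): " + ", ".join(bad)
    # Step 2: both drives are good, so check the cable seating.
    if not cable_tight:
        return "reseat the cable"
    # Step 3: cable must match the most advanced device in the chain.
    if not cable_matches_fastest:
        return "replace cable with one rated for the fastest device"
    # Step 4: last resort, per the text.
    return "recheck jumper settings, then replace the jumpers themselves"
```

Running `diagnose_ata_chain({"hd0": True, "hd1": False}, True, True)` points at the second drive, mirroring the remove-and-test-alone step in the paragraph above.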
Cable Select
Now, if you have been really sharp, you will have noticed that there were three
sets of pins shown in Figure 2.3, and up until now, only two sets of those pins
had a reason to be jumpered. The third set is for cable select (CSEL), which
does just what it says: it lets the controller decide which drive will be the master
and which drive will be the slave. CSEL is one of those features of the ATA
specifications that has been around for a while, but you may have never had
an opportunity to work with it. There were some problems with the original
specifications. Look at Figure 2.5, which shows a CSEL configuration with one
drive; it has been assigned the drive letter C:.
FIGURE 2.5
Drive 1
C:
Look at Figure 2.6 to see what happens when you add a second drive.
FIGURE 2.6
Drive 0
C:
Drive 1
D:
You will see that the cable select has automatically assigned the letter D: to
the second drive. Now, if you have completely used all of the space on the first
drive for C: and the second device is a CD-ROM, all is fine. What happens if
you haven't? Say you have already created a partition on the first drive and
assigned it the letter D:. Now you have chaos.
Some controller manufacturers have done serious work to solve the problem,
but for the most part, especially in a server implementation, you may want to be
really sure and just take matters into your own hands and configure the settings
yourself.
If you are installing new devices into a computer, and things are not working
as planned, the first thing to check is the configuration of master and slave.
After that, check to make sure the cables are plugged in properly.
Oh, yeah, one other thing. Don't be like me. I tried for about 15 minutes to
get an IDE CD-ROM to be recognized by the system before I noticed that
while the master/slave was right, and the cable was in the right way, having
a power cord connected should have also been a priority. There are times
we all do really dumb stuff, and I think I hold the record!
The term AT Attachment (ATA) is synonymous with IDE. ATA has gone
through several version iterations, mostly due to increased computer bus
speeds. Visit www.webopedia.com and perform a search on the keyword ATA
for more information.
The easiest way to tell the two apart is to simply look at the connector
cables for each. IDE uses a 40-pin connector and SCSI uses anywhere from
Centronics and DB25 (for SCSI I) to a 50-pin connector for SCSI II and a
68-pin connector for SCSI III. There's no mistaking an IDE cable for a SCSI
cable. So, when in doubt, even if the hard drive doesn't have a label or a cable,
you can count the number of pins it will accept and you'll know what kind
of drive you're dealing with.
Note that it's possible to mix IDE hard drives with SCSI drives in a system.
Normally I don't like to do that because it can be very confusing to try to figure
things out. Simple is better. But keep in mind that drive mixing can be done.
Another interesting thing that you might get into, though it's not as common
today, is the need to know the number of cylinders and heads that an IDE
hard drive comes with. In computers with an older system BIOS, the computer
didn't recognize the IDE hard drive until you keyed in the number of cylinders
and heads the drive was using. Then the BIOS would (most times) recognize
the drive configuration and bless it as usable. Today the system BIOS
auto-detects the hard drive's heads and cylinders and you don't have to go through
that rigmarole. The problem with the cylinders/heads scenario is that some
hard drives didn't come with that information stamped on them! You had to
go to a book or get on the Web (or on a BBS in the old days) to download a
schematic for the drive so you knew what to plug into the system BIOS.
SCSI is much easier to set up because you don't have to worry about getting
master/slave relationships right, nor do you have to be concerned about the
BIOS and whether it detected the drive's heads and cylinders correctly. On top
of that, you can string several SCSI devices together (up to 7 for SCSI I, 14 for
SCSI II and III) so you can have a veritable Christmas tree of SCSI hard drives.
You have two or three issues to be concerned about with SCSI drives,
though. First of all, you need to be worried about properly cradling the drives
and getting adequate cooling to them. It's not a wise idea to cram bunches
of SCSI hard drives into a clone tower's drive bay just because it'll accept
them. Please be cognizant of the heat that a hard drive can put out and the
potential for burning up all hard drives in the system if you don't account for
cooling.
Also, you'll have to make sure your SCSI IDs are correct. This is usually
quite easy to do. Most internal SCSI hard drives use jumper pins, and you'll
simply have to read your drive's documentation to tell how to set it to the SCSI
ID you're interested in using. Typically you won't use ID 7. That's most often
reserved for the SCSI adapter itself, hence the seven-drive SCSI I limitation.
I like to set it up so that in, say, a three-drive system, I set my boot disk
for ID 0, and the next two for ID 1 and ID 2. If you have a SCSI CD-ROM
you're hanging off the system (a pretty rare occurrence), you could set it at
ID 3. Ditto for other SCSI gear.
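As a rough illustration of that ID plan, here is a Python sketch that hands out IDs 0, 1, 2, and so on in order, and refuses more than seven devices on a narrow (SCSI I) bus since ID 7 belongs to the adapter. Real IDs are, of course, set with jumpers on the drives themselves per the documentation.

```python
HOST_ADAPTER_ID = 7  # conventionally reserved for the SCSI adapter itself

def assign_scsi_ids(devices):
    """Assign IDs 0, 1, 2, ... in list order, leaving ID 7 for the adapter.

    Illustrative only: on real hardware each ID is set with jumper pins
    on the individual drive.
    """
    if len(devices) > 7:
        raise ValueError("narrow SCSI allows at most 7 devices plus the adapter")
    return {device: i for i, device in enumerate(devices)}

ids = assign_scsi_ids(["boot_disk", "data1", "data2", "cdrom"])
print(ids)  # boot disk gets ID 0, the rest count up from there
```

Putting the boot disk first in the list mirrors the convention in the text of booting from ID 0.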
Finally, it's important to match the speed of the drives. Older SCSI drives
operate at 7,500 RPM, but today's SCSI drives run at 10,000 RPM. It's not
a wise idea to hang a 10,000 RPM drive in a system with other 7,500 RPM
drives. I don't think it'll break anything, but you'll see variations in I/O and
could experience some funny activity with the machine.
Some older cards search the hard drives counting down in SCSI ID order.
Thus, in a configuration with a hard drive at ID 0 and one at ID 4, the system
would be trying to boot to ID 4 first. This can be, as you might imagine, very
confusing. Future Domain, a SCSI card company that was purchased by
Adaptec, operated this way. Watch out for this unusual behavior!
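The difference in scan direction can be modeled in a couple of lines. This is a toy model of the behavior just described, not actual adapter firmware logic:

```python
def first_bootable(ids, scan_descending=False):
    """Return the SCSI ID an adapter would try to boot from first.

    Most adapters scan upward from ID 0; some older cards (like the
    Future Domain models mentioned in the text) scanned downward.
    """
    return max(ids) if scan_descending else min(ids)

print(first_bootable({0, 4}))                        # typical card tries ID 0
print(first_bootable({0, 4}, scan_descending=True))  # older card tries ID 4
```

With drives at IDs 0 and 4, the descending scan lands on ID 4 first, which is exactly the confusing boot behavior the paragraph warns about.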
You'll need to be cautious of SCSI I, SCSI II, and SCSI III relative to the
cabling you'll have to do both internally and externally. If you've got a SCSI III
adapter in the computer but the hard drive you're trying to connect to is SCSI I,
then you'll need a cable that either has an adapter on one side or is SCSI I-to-
SCSI III in design. This cable rule holds true for external devices connecting to
the external SCSI port as well. You can buy cables that are specially matched like
this, or you can simply buy an adapter. It might be a good idea to shy away from
adapters if you can, though in some circumstances you may not be able to.
two IDE hard drives into a computer, you'd still be faced with making one
hard drive the master, one the slave. Generally, in situations such as this,
the OS will live on the master hard drive and the second hard drive will be
used for data. Both of the above IDE configurations are quite common.
What happens if you need two IDE hard drives, an IDE CD-ROM, and an
IDE CD writer? Well, then you're stuck with buying a second IDE controller
card or opting for an EIDE scenario. Today's motherboards typically include
IDE connections right on the board. No matter how you connect your
hardware, one device will be master, one will be slave.
When setting master/slave relationships, you'll almost always have to
adjust a jumper pin on the drive itself. These are clearly labeled. Read your
drive's documentation.
Summary
Between Chapter 1 and Chapter 2, you should have your disk subsystems
covered. If you are asked which is right for your implementation, you will have
a lot to think about, but you should be able to make an informed decision. As
far as the exam goes, make sure you are able to keep each of the ATA
specifications straight. Fortunately, they make it relatively easy for you, just with the
naming convention. You should pay attention to things like when the 40/80
cable came into being, and how to choose a master or a slave device.
This chapter describes adding or changing hard drives in a system. You
essentially have two flavors of drives to consider: ATA/IDE or SCSI. There are
constant improvements and upgrades to each category, and so you might wind
up changing out a SCSI I drive for a SCSI III drive that will result in an I/O
performance enhancement. Telling the two types apart is easy: look at the end
of the drive and count the pins. ATA/IDE is a 40-pin setup; SCSI varies from
50- to 68-pin depending on the type of SCSI. You'll need to be aware of cabling
issues with SCSI; some installations may require a SCSI I-to-SCSI III cable,
for example.
You'll also have to keep in mind termination issues with SCSI. Generally
the SCSI adapter (ID 7) will be terminated, and you may have to terminate the
other end of the chain as well. External devices use an external terminator,
while internal devices use jumpers for termination.
Modern servers use drive cages and ready-made slide-in devices that allow
for easy removal and upgrade of drives. These slide-in devices are proprietary,
so if you've got a slider for a Compaq computer it probably won't work in a
Dell, and vice versa.
Exam Essentials
Know that ATA and IDE are synonymous ATA is the official standard
defined term for IDE devices, though you will usually hear these devices
referred to as just IDE.
Know that IDE devices have the controller integrated into the drive
Unlike SCSI devices, which use an actual controller, the IDE controlling
device is contained as part of the drive. That is why it is referred to as
integrated.
Know the characteristics of each type of ATA device ATA was the first
type of IDE device, and was very limited in speed, addressable hard drive
size, and number of devices on the IDE chain. ATA-2 used DMA channels
Key Terms
Before you take the exam, be certain you are familiar with the following terms:
AT Attachment (ATA)
ATA Packet Interface (ATAPI)
ATA 100
ATA 2
ATA 3
block transfer mode
cable select (CSEL)
Cyclic Redundancy Check (CRC)
Direct Memory Access (DMA)
electromagnetic interference (EMI)
Enhanced IDE (EIDE)
Fast-ATA
Frame Check Sequence (FCS)
Identify Drive
Integrated Drive Electronics (IDE)
jumper
Logical Block Addressing (LBA)
master
Mean Time Between Failure (MTBF)
paddleboards
Parallel ATA
Programmed Input/Output (PIO)
Self-Monitoring Analysis and Report Technology (SMART)
Serial ATA
slave
Ultra DMA 33
Copyright 2001 SYBEX, Inc., Alameda, CA
Review Questions
1. What does IDE stand for?
A. Integrated Device Electronics
B. Integrated DMA Efficiency
C. Integral Drive Economics
D. Integrated Drive Electronics
E. Interior Device Efficiency
2. What is ATA?
A. The name of an airline.
B. The actual standard that defines IDE.
C. IDE is the standard that defines ATA.
D. A type of burst mode DMA data transfer.
3. How does PIO work?
A. With PIO the input/output (I/O) goes directly to the processor.
B. With PIO the I/O is sent directly to memory.
C. With PIO the I/O bypasses the memory and the processor.
D. With PIO the I/O is sent simultaneously to the processor and to the
memory.
4. With DMA, where does the I/O go?
A. Directly to the processor.
B. Directly to memory.
C. The I/O bypasses the memory and the processor.
D. The I/O is sent simultaneously to the processor and to the memory.
upgrading the hard drives in an older server that has been running for two
years now. There is little documentation available for this server. What is
the first thing Monica must determine before she can go forward with her
hard drive replacement?
A. How many drives are in the system
B. The type of hard drives
C. The SCSI IDs of all the hard drives
D. The master/slave relationship
E. What type of hard drives are in the computer
10. What is one of the reasons the ATA specifications are getting away
up, causing the electrons to go faster. This can cause the cable to
overheat, causing fires.
C. The cable can become twisted.
D. The cable can block airflow from the fan, causing excess heat
12. Which two of the following choices describe how devices that are part
slave relationships?
A. There are usually two sets of three pins.
B. There are usually three sets of two pins.
C. There are normally six sets of two pins.
D. It varies.
15. What is the name of the device that connects the two pins to create an
I/O path?
A. Pin connector
B. Rocker switch
C. Bipolar DIP switch
D. Jumper
16. You have just been given a new IDE hard drive. When you check to see
one but you can't seem to get the hard drive to come up and be recognized.
There is an IDE CD-ROM in the system as well. What could be
the problem?
A. BIOS doesn't recognize the correct cylinders and heads.
B. CD-ROM is set to be master.
C. CD-ROM and hard drive are both set to be slave.
D. Termination jumper on hard drive isn't set.
18. What is the rated data throughput for an ATA 66 device?
A. 33 MBytes/second
B. 44 GBytes/second
C. 66 MBytes/second
D. 66 KBytes/second
E. 66 GBytes/second
19. What is another name for an ATA 66 device?
A. Ultra 66
B. Supra 66
C. DMA 66
D. EMA 66
called Integrated Device Electronics and in others it is called Integrated Drive Electronics.
2. B. AT Attachment (ATA) is the actual standard that defines IDE.
3. A. With PIO, all I/O goes through the processor.
4. B. With DMA, I/O is sent directly to memory.
5. B. DMA is first used in the ATA 2 specification.
6. C. In the early ATA specifications, the cable had 40 pins and 40
conductors.
7. D. In an ATA 66 cable there are 40 pins and 80 conductors.
8. D. The additional 40 conductors are used for grounding to prevent the
ing and what type they are: SCSI or IDE. Once she knows what type of
drive she's dealing with, she can ascertain their SCSI IDs or the master/
slave relationship. She should also ascertain the speed of the drives, if
SCSI. Most drive and schematic information is usually available on the
manufacturer's Web site.
10. D. Because of the cable width, it can block airflow from the fan, causing
excess heat.
four devices.
Chapter 3
Verify N+1 stepping.
In the last chapter, we spent a lot of time talking about how to link
physical hard disks together to give you more disk space and a sense of
redundancy in case of a failure. Now we are going to move from the disk
subsystem to the brains of the operation: the CPU and the ways that we can
maximize effectiveness.
As you look over the objectives, you will see a lot of attention paid to grouping
CPUs together, either as part of the same physical computer with
multiprocessing or by taking advantage of groups of servers by clustering. Clustering is
one of those buzzwords that just won't go away. It takes the concepts of RAID,
mirroring, and duplexing to a new height. Basically, we are moving the single
point of failure back from the disk subsystem, back even beyond the server. With
cluster servers, instead of having our data and applications protected by having
an additional disk subsystem, we are providing high availability of data and
applications by having additional servers.
For complete coverage of objective 3.2, please also see Chapters 6, 8, and 9.
Clustering
Although clustering and cluster servers are the current buzzwords, the
concepts have been around for years. Actually, the implementations have been
around for years. The mainframe, big iron people have had clustering almost since
day one, and on the LAN side, Novell had System Fault Tolerance systems
back in the days of NetWare 3. So, we are not talking about new technology.
When you start talking about clustering, you are actually opening up the
discussion of disaster recovery. Now, if you have ever participated in a
disaster recovery exercise, you know that it can get to be pretty intense. When you
are planning for disaster recovery, the first thing you have to do is determine
how valuable your company's data is, and how long you can live without it.
Most members of senior management will tell you that the data is invaluable
and you cannot live without it even for a second. Then you start showing the
person how you can, in fact, ensure that data is always available with
99.9999999% uptime, 24 hours a day, 7 days a week, 365 days a year. It is
an impressive display, until you get to the cost. That is when the rubber hits
the road.
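You can put numbers on those uptime claims with a little arithmetic. The function below converts an availability percentage into allowed downtime per year; the specific percentages are just worked examples.

```python
def downtime_per_year(availability_percent):
    """Seconds of allowed downtime per year at a given availability."""
    seconds_per_year = 365 * 24 * 3600
    return seconds_per_year * (1 - availability_percent / 100)

# "Five nines" (99.999%) allows a little over five minutes of downtime
# per year; the 99.9999999% figure quoted above allows well under a
# second, which is why the cost climbs so steeply.
print(round(downtime_per_year(99.999), 1))
print(round(downtime_per_year(99.9999999), 4))
```

Seeing the budget shrink from minutes to a fraction of a second per year makes the "until you get to the cost" reaction easy to understand.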
So, what is clustering anyway, how does it work, and why is there the
potential for costs to skyrocket?
Clustering Basics
Clustering is basically having redundant, mirrored servers. In other words, if
one of the servers in your network were to fail, its mirror would immediately
pick up the slack and make all of the up-to-the-minute data available to your
users, as well as all the applications that were running on the failed server. In
addition, for this to work really well, the changeover should be transparent
to the end user. In other words, Ursula User would have no idea whether her
requests for data and applications were coming from Server A or Server B.
Nor would she care.
So, now it comes time to determine what a disaster is and how we can
protect against it using clustering, because after all, there are several different
kinds of disaster. Well, the first and most obvious example of a disaster is to
have something happen to the file server; let's say that someone was walking
through the computer room with a can of soda and tripped, spilling the soda
into the file server.
Now, okay, so this scenario may not be one that immediately jumps to mind.
But I have seen a file server that handled all services for a small law firm that
was physically located in the break room, in a small enclosure directly under
the coffee pot. Now, of all the places that I have seen servers placed, this was
the second most bizarre. The most bizarre was at a company that wanted to
prove to their customers how technologically advanced they were. To make
sure their customers could see the server screensaver when they walked into
the waiting room, that is where the server was placed. Now, if that were not
bad enough, the keyboard was attached and was active, as was the mouse.
So, anyone who came into the reception area who was really bored could
amuse herself by starting and stopping services or just rebooting the server.
We are not even going to mention the data that was available.
Once the soda hits the file server, the smell of burnt silicon starts to
permeate the building and that server is officially designated as toast. If this
were the disaster we were protecting against, our clustered server could be
mere feet away and still provide protection. In this case, just having a
clustered server in the next room would be all the protection you would need.
Let's say there was a more serious problem. Suppose there was a fire in the
building that housed the file server. Now you can see that the only way
clustering would work would be if the second machine were physically located in a
different building, but the building could still be close by. If we moved the
disaster up in scale from impacting a single building to something like a flood,
tornado, or earthquake, now the clustered servers need to be several (or many)
miles apart to be safe. We can even take this a step further: Suppose you live
in a part of the world where political unrest is a way of life, or war is
commonplace. In that case, you may want to have one of your clustered machines
located on the other side of the globe.
Clustering is just making sure that the mission-critical business applications
and data that your enterprise requires to operate have high availability, meaning
that they are available 24 hours a day, 7 days a week, 52 weeks a year, year in
and year out. This high availability is usually necessary simply because of the
cost of operations. Let's say that you are talking about the application that runs
reservations for a major international airline. If that application is unavailable
for any reason, for any time, anywhere in the world, the loss of revenues to the
company could be in the millions-of-dollars-an-hour range.
Let's look at another example. Recently I read an article about the IS
department at the National Aeronautics and Space Administration (NASA).
It is responsible, among other things, for the computer network that tracks
the space shuttles when they are in orbit. This involves things like
communication, tracking, navigation, life support, small things like that. Can you
imagine what the availability of that system must be every time a shuttle
takes off? I would imagine that having the space shuttle just take another
orbit while we reboot the server is not necessarily an option.
Clustering Technologies
Clustering offers differing challenges in each of the scenarios described
above. Clustering, obviously, is a combination of hardware and software
solutions to the high availability challenge. This challenge may be something like
making sure that a database application is available no matter what the
circumstances, or just making sure that a vital network service like e-mail is not
affected if one of the servers on the network should fail. Basically, a clustered
environment would look something like Figure 3.1.
FIGURE 3.1
Networked workstations connected to the database server
With cluster servers, you have at least two servers working together in
tandem. If something were to happen to either of the servers, the other server
would be able to take over immediately. This means that if any of the server
applications were to fail, the cluster server software would restart any
configured applications on one of the remaining servers. This seems to imply
that each of the cluster servers has to be configured exactly the same way,
and that is not necessarily the case. In some implementations you may have
two different applications running on the two servers, and if one server were
to fail, the other server would start the failed application and make it
available. Look at Figure 3.2, which uses a database application and an e-mail
application as an example. This is the way the cluster would look before the
failover.
FIGURE 3.2
Networked workstations connected to cluster
Figure 3.3 shows the way the cluster would respond after a system failure.
FIGURE 3.3
Access to database and e-mail maintained
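The failover behavior shown in Figures 3.2 and 3.3 can be sketched as a toy model. The node and service names are invented for the example; real cluster software is vastly more involved, but the core idea of restarting a failed node's services on a survivor is the same.

```python
class Cluster:
    """Toy two-node cluster: each service runs on one node, and when a
    node fails, its services are restarted on a surviving node (as when
    e-mail moves over to the database node in Figure 3.3)."""

    def __init__(self, placement):
        # placement: service name -> node name it currently runs on
        self.placement = dict(placement)
        self.up_nodes = set(self.placement.values())

    def fail_node(self, node):
        self.up_nodes.discard(node)
        if not self.up_nodes:
            raise RuntimeError("no surviving nodes")
        survivor = sorted(self.up_nodes)[0]
        for service, host in self.placement.items():
            if host == node:
                self.placement[service] = survivor  # restart on survivor

cluster = Cluster({"database": "node_a", "email": "node_b"})
cluster.fail_node("node_b")
print(cluster.placement)  # both services now run on node_a
```

Note that the two nodes start out configured differently (one runs the database, one runs e-mail), which matches the point in the text that cluster members do not have to be identical.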
But providing access to applications is only half the problem. What about
providing access to the data that can be changed and updated on a
minute-by-minute basis? Going back to our example of the reservation system for an
international airline, the application is not much use if the database for flight
and passenger information is not available. Therefore, each server node in the
cluster must have access to the same information so that the application and
the data can be moved from one location to another without any downtime.
One of the ways this can be done is with shared external storage. Take a
look at Figure 3.4.
FIGURE 3.4
Shared SCSI bus; SCSI disks
In this case, both servers are accessing the same storage location, linked
together with a SCSI bus or a Fibre Channel configuration. It is up to the
cluster server software to decide which node has access to which pieces of
data at any given time. In this configuration, only one node can access any
information at any time. This is one way of making sure that the data on the
external storage is not corrupted.
The disadvantage of this configuration is that due to the limitations of
SCSI, the machines must be located close together. As you can see in Figure
3.5, shared SCSI technology has a distance limitation of just 82 feet.
FIGURE 3.5
Cluster nodes; shared SCSI bus; SCSI disks
There are solutions available that, for example, can use an IP network to
bypass the distance limitations of Fibre Channel or shared SCSI. When it comes
time to manage the data, it is handled like this: When there are changes to the
data on the primary node, these changes are captured and are sent via TCP/IP to
the backup node. That way, there is an exact copy of the data stored on the
second disk. If for any reason the primary data storage area should become
unavailable, the data is still accessible. In some cases, the solution can actually
create multiple copies of the data, so even the backup is being backed up.
In this way, if there were a problem with the home site in Minneapolis, users
in different areas of the world would not suffer. Configured applications
would be back online within minutes and the data would be up-to-the-minute.
This would save tons of time over solutions like tape backups, where
the data is, at best, hours old, and at worst, days or weeks old.
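Here is a minimal, in-memory sketch of that capture-and-forward idea. In a real product the captured changes would travel over TCP/IP to a remote node; here both copies live in one process purely to illustrate the flow, and all names are invented for the example.

```python
class ReplicatedStore:
    """Sketch of primary/backup replication: every write to the primary
    is captured and applied to each backup copy as well."""

    def __init__(self):
        self.primary = {}
        self.backups = [{}]  # some solutions keep multiple backup copies

    def write(self, key, value):
        self.primary[key] = value
        for copy in self.backups:   # replicate the captured change
            copy[key] = value

    def fail_over(self):
        """Promote the first backup if the primary storage is lost."""
        self.primary = self.backups[0]

store = ReplicatedStore()
store.write("flight_101", "booked")
store.fail_over()
print(store.primary["flight_101"])  # data is up-to-the-minute after failover
```

Because every change is applied to the backup as it happens, the promoted copy is current, unlike a tape restore, where the data would be hours or days old.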
Clustering Scalability
When it comes to scaling the clustering solution, you get what you pay for.
In some cases, you may only cluster on a one-to-one basis, so there is little
flexibility. With other solutions, you can configure the cluster to provide a
variety of solutions. Take a look at Figure 3.6.
FIGURE 3.6
TCP/IP network
In this case, we have the most basic clustering solution, where one server
is acting as the primary and the other is acting as the backup. This is the
prime definition of clustering. Any data that is written to the primary is
written to the backup. If something were to happen to the primary, the failover
would bring the backup online and life would go on with up-to-the-minute
data. In this case, there is one primary server and one backup server.
(I know the term Fibre Channel is new and we have not talked about it yet, but
we will, later in this chapter. Right now it is just important to realize that it can be
used to link servers with storage subsystems and it has a longer distance limitation
than SCSI. We will cover the rest of the stuff later!)
That is not necessarily the way it has to work. Using some software
implementations, you can configure clustering so there are two primary servers and
the data replication is two-way, as shown in Figure 3.7. Now, this configuration
does have a gotcha. In this case, the data has to be independent. Any data that
originates on one server can only be changed on that server. If it is changed on
the backup server, the changes will not be replicated back to the original server.
FIGURE 3.7
Now, there are other, more creative ways that you can use clustering
solutions. For example, you can do what is called daisy-chaining clustered
servers. In this case, let's say that we had some critical data in the office in the
Florida Keys. If the primary server went down, we wanted a rapid failover,
so users could quickly pick up where they left off. That solution would
require a backup server on site, so we would not have to fight wide area
network bottlenecks.
Because this data is critical, and because we also understand that the Keys
are subject to hurricanes and other natural disasters that could render the
two-servers-in-the-same-location solution worthless, we need to make
another backup copy off-site, somewhere far away. In this case, we can
daisy-chain the servers, so there are two servers in the Keys, and another off-site,
away from potential storms and other disasters.
Clustering Summary
Clustering is a viable solution, but the level of protection that you get
depends on the level of expenditure that you make. Some clustering solutions
that are right out of the box can handle only a one-to-one server relationship,
and even then, the servers have to be in close proximity. If you want true
disaster recovery capability where the servers are located hundreds of miles
apart, you are probably going to have to go with a specialty solution.
Fibre Channel
Now, one of the suggested ways to link things together is with
Fibre Channel. Let's take a look and see how that works, and what kinds of
things you can hook together.
Point-to-Point
You remember point-to-point from back in the Network+ class, don't you?
This is the simplest of all topologies. With a point-to-point connection, there
is a bidirectional link that connects the N_ports on two nodes. A point-to-point
topology will usually underutilize the bandwidth of the communications link.
Arbitrated Loop
With arbitrated loop, we start looking at a form of Fabric topology.
If any link in the loop should fail, the communication between all the
L_ports is terminated.
If stations are added to the Fabric, it does not reduce the point-to-point
channel bandwidth.
Diagram of a SAN: servers connected through a switch or hub to Fibre
Channel RAID, with a SCSI bridge linking a SCSI RAID array
Now, the interesting thing about SANs is that both SCSI and IP protocols
are used to access the storage subsystems. The servers and the workstations
all use the Fibre Channel network to get access to the same sets of storage
devices or systems. If there are older SCSI devices on the network, they can be
integrated into the Fibre Channel network through the use of the SCSI
bridge. What kind of performance are we talking about? Well, using a gigabit link, bandwidth is reported to be in the neighborhood of 97 MBytes/second for large file transfers.
SANs also offer support for network resolution protocols like ARP, RARP, and others.
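As a rough sanity check on that figure, the payload ceiling of a gigabit Fibre Channel link can be estimated from the line rate and its encoding. The 1.0625 Gbaud line rate and 8b/10b encoding below are the standard 1GFC values, not numbers taken from this chapter, so treat this as a back-of-the-envelope sketch:

```python
# A 1 Gbit Fibre Channel link runs at about 1.0625 Gbaud and uses
# 8b/10b encoding, so only 8 of every 10 bits on the wire carry data.
line_rate_baud = 1.0625e9
data_bits_per_sec = line_rate_baud * 8 / 10   # usable bits per second
mbytes_per_sec = data_bits_per_sec / 8 / 1e6  # convert bits to megabytes

print(round(mbytes_per_sec))  # 106
```

The theoretical ceiling of roughly 106 MBytes/second squares with the reported real-world figure of about 97 MBytes/second once protocol overhead is taken into account.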
The CPU is the brains of the server. It is responsible for the control and
direction of all the activities that the server participates in, using both the
internal and external buses. The CPU is just a processor chip that consists of
millions of transistors. That is what a CPU is and does. But like most things
in computing, there are dozens of processors to choose from.
When it comes to CPUs, there are only a few well-known manufacturers.
The best known, and the two manufacturers that are constantly battling it
out for the title of fastest, are Intel and Advanced Micro Devices (AMD). Of
the two, Intel is probably the more widely recognized, although AMD is
making inroads every day in the desktop and mobile computing market.
Copyright 2001 SYBEX, Inc., Alameda, CA
Back in 1965, one of the co-founders of Intel, Gordon Moore, was preparing
for a speech when he made a remarkable discovery. He noticed that the
number of transistors per square inch on integrated circuits had doubled
every 12 to 18 months since the integrated circuit was first invented. Moore speculated that the trend would continue, and that became Moore's Law. If you
examine the prediction, you will find that it has been remarkably accurate. In
recent years the trend has slowed, and Moore has revised the law to state
that the density of data will double every 18 months. Why are we mentioning
Moore's Law? Simple: everything you are about to read about processors is
outdated. Most of it was probably outdated in the time it took this chapter to
go from my desk, through the editorial process, to the printer. So, if you are
reading this and thinking, "What the heck is he talking about? Gigahertz is not
the state of the art," remember that when this was being written, the gigahertz
barrier had just been broken.
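Moore's observation is easy to play with numerically. The sketch below assumes a hypothetical starting density of 1,000 transistors per square inch and an 18-month doubling period, just to show how quickly the doubling compounds:

```python
def doublings(years, period_months=18):
    """Number of doublings over a span, per Moore's observation."""
    return years * 12 / period_months

def projected_density(initial, years, period_months=18):
    """Project transistor density assuming one doubling per period."""
    return initial * 2 ** doublings(years, period_months)

# Starting from a hypothetical 1,000 transistors per square inch,
# 9 years at an 18-month doubling period is 6 doublings: a 64x increase.
print(projected_density(1000, 9))  # 64000.0
```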
Intel Processors
At the time of this writing, the primary Intel processors on the server market
were the Pentium III and the Xeon Pentium III. Intel was also about to release its
first foray into the IA-64 architecture, the Itanium processor. Since the Itanium was scheduled to be the latest, greatest, bestest, fastest server processor on
the block, it was billed as the perfect solution for the large server market, even
before it was released. That left the Xeon and the Pentium III to hold down the
fort in the mid-range and low-end server market.
Intel also has the Celeron processor on shelves, but it was designed for the
lower-end desktop market. Because it is designed for the desktop, we won't
look at it here.
The Xeon can use 1MB or 2MB of unified, non-blocking, level-two cache.
For a more in-depth look at cache and at memory in general, read Chapter 4,
Memory.
Figure 3.9 shows a picture of the Xeon Pentium III processor; the photo
was taken from Intels pressroom at http://www.intel.com/pressroom/
archive/photo/processors.htm.
FIGURE 3.9
FIGURE 3.10
The Pentium III is not as expensive as the Xeon processor, and its supporting
cast of mainboard and memory will bring down the cost as well.
RISC Processors
You want power, we got power. Of course, like most things in computing,
the more performance you receive, the more you pay for it.
RISC-chip servers are at the high end of the server platform, usually
reserved for high-availability, heavily accessed Web servers. RISC-based
servers can scale from a single processor up to 64 processors in the same
machine. Of course, the cost is going to be considerably higher than the usual
$10,000 to $15,000 price range for a starting server; in the case of a RISC
server, costs well over $100,000 are not unheard of.
RISC is usually associated with Unix implementations, although Windows NT also ran on RISC platforms.
Advantages of RISC
The RISC processor does offer several advantages over its Complex Instruction
Set Computing (CISC) counterparts:
Speed The name says it all. With RISC, you are dealing with a reduced
instruction set. That means that RISC processors often show two to four
times the performance of CISC processors built with comparable technology
and running at the same clock rates.
Simpler Hardware Because the instruction set is simpler, it uses up less
chip space. That means that extra functions like memory management or
floating-point arithmetic units can be installed on the same chip. Also,
since the chips are smaller, there can be more parts on a single silicon
wafer, and that reduces the cost per chip dramatically.
Shorter design cycle Since the chips are simpler, they don't take as long to
design as their CISC brethren. RISC chips can therefore respond to
changes in the hardware marketplace sooner than CISC designs, which
means greater leaps in performance between generations.
RISC Summary
While RISC is exceptionally scalable and works tremendously well in servers
that are going to be heavily utilized, the monetary costs can be considerable.
Multiprocessing Support
For those of you who are fans of the American comedian Tim Allen,
perhaps we should just re-title this section "More Power!" See, unlike Allen, I
don't think it is just a guy thing. I want to make this more politically correct,
because everyone at one time or another wants more power. Certainly the
people on your network do, every time they complain about how slow the
network is running today. One of the ways that you can give them more
power is with Symmetrical Multiprocessing (SMP).
You may be asking yourself, why all this fuss about multiple processors in
a single computer? After all, with the speed of processors getting faster all the
time, won't that take care of the issue?
Multiprocessing Basics
This will be like the discussion of RAID, complete with a whole new set of
acronyms and strange terms. Bear with me and it will make sense. First of all,
why might you need SMP?
When you take a look at the world of the uniprocessor (UP), you realize that
the processor is actually doing a lot of work at the same time. For example, the
processor may have a fixed-point arithmetic unit and a floating-point arithmetic
unit all on the same CPU. That means that the processor can run multiple
instructions within the same CPU. The thing to keep in mind is that while several
instructions can be run in parallel, only one task can be processed at a time.
Look at Figure 3.11.
FIGURE 3.11 Uniprocessing: multiple tasks queue up behind a single processor.
In this case, you have multiple tasks backed up behind a single processor.
Now, you are probably saying to yourself, "Wait a minute, he just said that
processors can perform multiple instructions at the same time. What is the
difference?" Think of it this way. Imagine yourself drying dishes after a big
meal. Each dish is a task. You may be able to dry multiple parts of the dish
at the same time, but you cannot dry multiple dishes at the same time. Adding another processor to the mix, like bringing in another person to help, will
cut the number of tasks down by half and speed up the process of drying the
dishes. Figure 3.12 shows what I mean.
FIGURE 3.12 Multiprocessing: the tasks are split among multiple processors, with each processor working through its own queue.
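The difference between Figures 3.11 and 3.12 can be sketched as a toy scheduler. The function below simply deals tasks out round-robin; real SMP scheduling is far more involved, so this is only an illustration of the queues in the figures:

```python
def schedule(tasks, n_processors):
    """Deal tasks out round-robin, one queue per processor."""
    queues = [[] for _ in range(n_processors)]
    for i, task in enumerate(tasks):
        queues[i % n_processors].append(task)
    return queues

tasks = ["t1", "t2", "t3", "t4", "t5", "t6"]

# One processor: all six tasks back up behind it (Figure 3.11).
print(schedule(tasks, 1))

# Three processors: each handles two tasks (Figure 3.12).
print(schedule(tasks, 3))
```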
Now, you would think that, like bringing in another person to help dry the
dishes, adding another processor would increase the overall performance of a
system in a directly proportional fashion. In other words, if you added a second processor to the system, the system would be twice as fast. It would be
wonderful if it worked that way, but it doesn't. You see, there are a lot of other
factors that have to be taken into consideration. The problem is not just buying a motherboard that is compatible with two CPUs. All of the chipsets on the
motherboard have to be able to work with more than one CPU. The CPUs
themselves have to have hard-coded programming to work in parallel and,
once all the hardware is in place, the operating system has to be able to
handle multiple processors. All of that has to happen just to make sure
two processors can work in tandem. Can you imagine how much behind-the-scenes stuff has to go on to work with up to 64 processors? Not only that, but
there is still one more piece to the puzzle, and that is the application.
SMP Hardware
Obviously, when you are talking SMP, the hardware is important. With
some of the earlier Intel CPUs, you could mix and match older CPUs with close
clock speeds, so you could do things like put a Pentium 166 in with a Pentium
200. You would just have to set both CPUs to run at either 200MHz or
166MHz, which of course could affect system stability.
Things are a little different with the more recent CPUs. With the more
recent systems, a multiplier is applied to the CPU bus clock rate, called the Front Side Bus (FSB) rate, to produce the final clock speed. So, with a Pentium III 500, the FSB is 100MHz and the
multiplier is 5, giving you the 500MHz. Intel now locks the multipliers used in the CPU to control the final clock speed. Because of this, if you
are running multiple Intel CPUs, you have to make sure the clock speed
and the multiplier are the same.
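The FSB-times-multiplier arithmetic is simple enough to express directly. The function below just restates the Pentium III 500 example:

```python
def core_clock(fsb_mhz, multiplier):
    """Final CPU clock speed = FSB rate x multiplier."""
    return fsb_mhz * multiplier

# A Pentium III 500: 100 MHz FSB with a 5x multiplier.
print(core_clock(100, 5))  # 500
```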
The CPU also has to have the onboard circuitry to work with other CPUs in
the same system. If that circuitry is not there, the CPU will simply not take
advantage of the other CPUs. This should not be a worry, because all of the Intel
CPUs that have been developed since the early Pentiums have had the ability to
work and play well in an SMP environment.
Intel does make the distinction between the dual processing (DP) environment and the multiprocessing (MP) environment for chips that are marked
with a VSU. If your Pentium chip has the VSU marking, it means the chip
has been validated to work in a uniprocessing and multiprocessing environment, but not in a dual processing environment. The difference is that DP is a
special mode of operation for two Pentium processors in which there are four
dedicated private pins and specific DP on-chip circuitry. This circuitry allows the processors to handle the negotiation of how to use the
resources and the data buses. Since there is no operating system intervention
required, this is referred to in Intel literature as a "glueless" solution. An MP
setup requires "glue," like the operating system, to negotiate between the
processors.
There are limitations to the number of processors that can be used. For
example, some Pentium IIIs can only be used in pairs, while the Pentium III
Xeon can be used in eight-CPU configurations.
Operating Systems
Having the hardware able to recognize the multiple CPUs is one half of the
battle, but the other half of the battle is the operating system. The operating
system has to be able to figure out that there is more than one CPU present
and also load the proper kernel. The kernel must then be multithreaded to
take advantage of the multiple CPUs inside the OS. This is really a bigger
issue than it may sound, because many of the system calls are static and cannot be reconfigured to work in a multithreaded environment. In that case,
some locks have to be put in place to serialize those system calls.
The operating system is also responsible for system stability. It has to
manage the caches of the different CPUs. That management can get tricky,
because it has to make sure that the contents of the cache match each other,
as well as the original data, whether it is stored in RAM or on a disk. This
is one of the major hurdles for running multiple CPUs.
The OS must also support all of the processors that are available in the
hardware. For example, Windows 2000 Professional only supports two CPUs, so if you ran it on a system that had four, two
wouldn't be used. Windows 2000 DataCenter Server supports 32 CPUs out
of the box. If you are using Linux, some of the Linux kernels natively support
16 CPUs, although the kernel code can be rewritten so that up to 64 processors will be recognized.
You may also see the term processor or CPU affinity. This is the practice of
assigning specific applications or processes to run on a specific processor. For
example, you may have a quad-processor machine and want a database
indexing function to run specifically on the fourth processor. This would be a
function of affinity.
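On a modern Linux server, processor affinity can be set from user space. The snippet below pins the calling process to CPU 0 using Python's `os.sched_setaffinity`; this call only exists on platforms that support it, and the example is an illustration, not something described in the text above:

```python
import os

# os.sched_setaffinity is Linux-only, so guard the call.
if hasattr(os, "sched_setaffinity"):
    os.sched_setaffinity(0, {0})        # pid 0 means "the calling process"
    print(os.sched_getaffinity(0))      # the set of CPUs we may run on
```

On a quad machine you might instead pin a database indexing job to CPU 3 by passing `{3}`.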
Adding Processors
Adding processors can be a much less scary prospect if you do some basic
research on the server you're upgrading prior to attempting the addition.
There are several key things to keep in mind as you toy with the idea of
upgrading or adding new processors.
Verify N+1 Stepping
In the world of CPU manufacturing, the word stepping is akin to a version
number. When a new microprocessor is released, the product version is set at
step A-0. Later on, as engineering updates are made to the chip, new steppings
are assigned. If the change is minute, the number of the stepping will be
changed (i.e., A-0 to A-1). If the change is major, the letter of the stepping will
change (i.e., A-0 to B-0).
When considering a CPU upgrade, especially if you're adding a CPU in
order to turn the system into a multiprocessor computer, you'll want to verify
the current CPU's stepping and match accordingly, or replace it if the stepping
levels are too far from one another. Check with the computer manufacturer or
vendor for more detailed compatibility information.
In a single-CPU upgrade, the same caveats apply (matching the stepping
to the range supported by the manufacturer).
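On Linux, each CPU's stepping is reported in /proc/cpuinfo, so a quick match check can be scripted. The parsing below assumes the usual `stepping : N` line format and is only a sketch of the verification step described above:

```python
def cpu_steppings(cpuinfo_text):
    """Collect the stepping reported for each CPU in /proc/cpuinfo output."""
    steppings = []
    for line in cpuinfo_text.splitlines():
        if line.startswith("stepping"):
            steppings.append(int(line.split(":")[1]))
    return steppings

def steppings_match(cpuinfo_text):
    """True if every installed CPU reports the same stepping."""
    return len(set(cpu_steppings(cpuinfo_text))) <= 1

# Two CPUs reporting the same stepping: safe to pair up.
sample = "stepping\t: 3\nstepping\t: 3\n"
print(steppings_match(sample))  # True
```

On a real system you would read the text with `open("/proc/cpuinfo").read()` instead of using a sample string.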
Note that some processors require a DC power supply and have an associated
slot on the motherboard for the power supply unit. Verify whether your
processor implementation has this and order accordingly. If in doubt, ask the
manufacturer or consult the documentation.
Summary
You know the problem with writing a chapter like this? As you write
about all the exceptional technology, you just want to go out and set up a
cluster of servers with four Xeon processors, a couple of gigs of RAM, and
a storage system hooked to the main box by Fibre Channel with a few
terabytes of disk space, just to see if you can get it to work! Hmm, maybe
I could build it and sell it to my wife's company. Do you think that might be
a little overkill for a home-based business with 10 employees?
Anyway, enough of all this dreaming stuff. On to Chapter 4, where we
look at what kinds of memory to put into that big ol' bad boy.
You know, I could probably get all those servers to fit.
We talked about adding processors to a system. It's important to verify
your system's capabilities, either by checking with the manufacturer or consulting the system documentation. Systems are typically rated for a given range
of microprocessors, so you may not be able to run out and buy the latest and
greatest processor, slap it in your system, and hope that it works. It's important to understand your system's limitations.
Exam Essentials
Know what it means to cluster servers Servers are clustered for a variety
of reasons, usually to make sure that the single point of failure is moved
back beyond the server. You can think of clustered servers as mirrored servers, though in reality, clustering can provide a broader range of services
than just fault tolerance.
Know what high availability means High availability is one of those
buzzwords that means exactly what it implies. You want your network to
be available always, 24 hours a day, 7 days a week. You take all the steps
necessary to make sure your server is up and running to provide the
appropriate services and applications to your users. It is highly available.
Know the basics of Fibre Channel Fibre Channel can be used to link
storage subsystems (or other devices) to the network. It provides faster
throughput. Fibre Channel makes use of ports connected through the Fibre
Channel fabric. Fibre Channel is used in storage area networks, to provide
the bandwidth for remote access to large databases, and to provide bandwidth for remote backups.
Know about the different types of CPUs, including RISC, Pentium II,
Pentium III, and Xeon The Xeon supports both two-way processing and
multiprocessing, and supports four-way multiprocessing without specialized
chipsets. Xeons are more expensive than Pentium III processors.
Know which CPU you would use in a high availability super server
The RISC processor and the Xeon are designed for high availability and
high utilization servers.
Know the advantages and disadvantages of multiprocessing support
Before adding multiple processors, it is best to do a cost analysis. In some
cases, it may be cheaper to add another server with fewer processors than
it is to add a mainboard that can support more processors. For example,
it may be cheaper to provide a cluster of four servers with two processors
each than a single server with eight processors.
Make a checklist Know and understand the things to check for when
upgrading a system processor or adding processors to a multiprocessor
system.
Key Terms
Before you take the exam, be certain you are familiar with the following terms:
American National Standards Institute (ANSI)
arbitrated loop
channel
cluster servers
Complex Instruction Set Computing (CISC)
Cross-point
dual processing (DP)
F_port
FL_port
Fabric
Fabric Switched
Fibre Channel
Front Side Bus (FSB)
gigabit
kernel thread
Link Control Facility (LCF)
N_port
NL_port
point-to-point
Reduced Instruction Set Computing (RISC)
stepping
Storage Area Network (SAN)
Streaming Single-Instruction, Multiple-Data Extensions
Review Questions
1. When servers are clustered, you are providing redundancy of which
devices?
A. Network cards
B. Mainboards
C. RAID systems
D. Servers
E. Video cards
2. What are the key items that must match when you're attempting to
that's had better days. You need to verify the CPU stepping. How do
you go about gathering this information?
A. Read the serial number on the CPU and call the manufacturer.
B. Read the stepping number on the CPU.
C. See if the NOS reports the stepping number.
D. Obtain the stepping number from the Web.
E. Read the system documentation.
server. When the server boots, only one CPU reports online and the
NOS error logs report something about an L2 problem. What does
Louis need to check?
A. Secondary cache is mismatched between the two processors.
B. BIOS version is different between the two processors.
C. CPU speed is different between the two processors.
D. DC power converter missing.
6. The Linux 2.2 kernel can use up to 64 processors if what is done?
A. The kernel is rewritten and tweaked.
B. Nothing.
C. The processors are set to operate in parallel mode.
D. The processors have VSU on them.
7. What is the minimum number of servers in a cluster?
A. 4
B. 3
C. 2
D. 1
13. Alejandro is trying to add a second processor to his server but the
old one.
C. Buy a new processor, because the mainboard will burn out the
old one.
D. Flash the BIOS to provide for the extra instruction sets of the
Pentium III.
16. When the Pentium III first came out, there was some controversy
species list.
B. The floating-point decimal was not always accurate.
C. It had ID tracking.
D. It was thought to gather an inventory of hardware and software on
ing two more processors to a server that already has two, thus turning
it into a four-way computer?
A. Stepping of all four processors must match.
B. L2 cache of all four processors must match.
C. Speed of all four processors must match.
D. Must have ports available on motherboard for additional processors.
18. What is the minimum number of nodes in an arbitrated loop?
A. 4
B. 3
C. 2
D. 1
19. What agency certified the specifications for Fibre Channel?
A. ASCII
B. SCSI
C. ANSI
D. EBCDIC
E. IEEE
rewritten.
7. C. You must have at least two servers in a cluster.
8. B. You can have two primary servers. It is important to note that the
invasion of privacy.
17. B, C, D. The stepping number isn't nearly as important as the speed
Chapter 4
Memory
SERVER+ EXAM OBJECTIVES COVERED IN
THIS CHAPTER:
3.4 Increase memory.
Memory Types
You have to remember where I am coming from. The very first computer that I ever bought came standard with 512K (that's right, K) of Random Access Memory (RAM). Now, I will never forget the look on the
salesman's face when I told him that I wanted to upgrade my system to 1
megabyte (MB) of RAM. He thought I was nuts. He actually told me that
I was throwing away my money, because there would never be a use for that
much memory. Now, some operating systems have minimum suggested
requirements of 128MB for installation. Guess my friendly computer salesperson was wrong!
So, why is memory so important? If you want to speed up the performance of any PC or server, one of the first things you can do is add more
memory. As a matter of fact, that is a pretty common solution to server problems. It is always easier for a Central Processing Unit (CPU) to grab information out of memory than it is for the CPU to go look for it on a
hard disk, or in its instruction set. So, the more of the commonly referenced
information we can store in memory, the quicker the CPU can find it. The
faster the CPU does its job, the faster the server (or even a workstation)
appears. It is as simple as that. How does the CPU know what is commonly
referenced information and what isn't? It doesn't. So it just stores as much of
the stuff that people have asked for as it can. When people (or the system)
ask for stuff that is not in memory, it will usually rid itself of old
stuff that no one has asked for in a while and replace it with the new stuff
people (or the system) have recently asked for.
DIPs
Memory comes in various shapes and sizes, so lets start by taking a look at
some of the physical types of memory. I mentioned above that I had to
upgrade my first computer from 512K to 1MB of RAM. This involved a
technician adding some integrated circuits called dual inline packages
(DIPs) to the mainboard. These types of DIPs are shown in Figure 4.1. DIPs
have come in a variety of sizes, but now they are usually at least 256K per
DIP. To be honest, I have no idea what they were when I bought my first
computer, because for the first year I owned it, I was afraid to take the top
off for fear all the electrons would escape.
FIGURE 4.1
DIPs are still used for a variety of memory. For example, VGA cards or network
cards that have onboard cache will normally use DIPs.
SIMMs
The SIMM was just a different configuration of the DIP. Two types of SIMM
are shown in Figure 4.2.
FIGURE 4.2
SIMM memory
SIMM installation was actually a little more difficult than it sounds. Many of
the mainboards had plastic connectors, and if you were not careful, you could
break off the plastic. When that happened, the SIMM was not held securely in
its slot and it did not work well. This was usually time for a new mainboard,
and those were always expensive. For a while, I worked as a telephone technical support person, and my job was to talk people through the installation
of memory SIMMs. As a technician, I always warned the installer to be really
careful, and I just hated it when I heard something like "Oh darn, look what I
did" coming out of the phone.
Getting back to Figure 4.2, you can see that the SIMM, depending on age,
comes in two different configurations. There was the 30-pin configuration
and the 72-pin configuration. When the 30-pin SIMMs first came out, computers were working with 32 data bits. Unfortunately, each SIMM only handled 8 data bits, so you needed to provide one bank of four SIMMs. A
memory bank was simply a set of four slots. Most computers had two banks
of four SIMMs available, Bank 0 and Bank 1. The CPU would then address,
or work with, one memory bank at a time.
72-pin SIMMs took care of part of the problem, because each 72-pin
SIMM supported 32 data bits. If you were using a 486 CPU from Intel or a
68040 from Motorola, you only needed one 72-pin SIMM per bank to give
the CPU the 32 data bits it was looking for.
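The bank arithmetic above reduces to one division: the CPU's data bus width over the width of a single module. A minimal sketch:

```python
def simms_per_bank(bus_width_bits, simm_width_bits):
    """How many SIMMs make up one memory bank for a given CPU data bus."""
    return bus_width_bits // simm_width_bits

print(simms_per_bank(32, 8))   # 4  (30-pin, 8-bit SIMMs on a 32-bit bus)
print(simms_per_bank(32, 32))  # 1  (a single 72-pin SIMM fills the bank)
```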
Working with the early computers was always fun, because they never ceased
to provide unique opportunities. One of those opportunities was something
called chip creep. If you remember all the way back to high school science:
when things heat up, they expand; when they cool down, they contract. The
same is true of chips. After a computer had been turned on and off several
dozen times, the chips, which had expanded and contracted several dozen
times, may have worked themselves just ever-so-slightly out of their slots.
That meant the chip was not making proper contact, and the thing didn't work
as advertised. As a user, you became adept at taking the top off your computer and gently pushing down on all the chips to reseat them.
DIMMs
After the SIMM came the Dual Inline Memory Module (DIMM). Look at
Figure 4.3.
FIGURE 4.3 DIMM memory: an SO DIMM (top) and a 168-pin DIMM
As you can see, there are two types of DIMM, but most of them install
vertically into the mainboard, just like the SIMM. The difference between
SIMMs and DIMMs is in the pin configuration. On a SIMM, the opposing
pins on either side of the board are tied together to form a single electrical
contact. With a DIMM, the opposing pins remain separate and isolated to
form two contacts. DIMMs therefore usually have memory chips on both
sides of the module. DIMMs are used in 64-bit computer configurations.
This relates to the Intel Pentium or the IBM RISC processor.
At the top of Figure 4.3 is the Small Outline DIMM or the SO DIMM.
This DIMM is like a 72-pin SIMM in a reduced size. It is designed primarily
for laptop computers.
Next is the 168-pin DIMM. If you look carefully at it, you will notice the
notches in each side of the module. Instead of having to install this module
by inserting it at a 45-degree angle and rocking it back, this module slides
into its slot with rocker arms on each side. You start the installation by opening the rocker arms, and when you push the DIMM into the slot, the rocker
arms close and lock the module down. The rocker arm will then hold the
module firmly in place, eliminating any chip creep.
Now that we know what memory physically looks like, let's see how it
is used.
Cache Memory
For your basic server, there are two types of memory: cache memory and
main memory. Main memory is referred to by a variety of names, including
Dynamic Random Access Memory (DRAM) or just plain ol' RAM. DRAM is
the part of memory that is responsible for holding instructions and data
that will be used by the applications running on your server. It is also used
by the server operating system itself. When the server's CPU executes an instruction from an application, it goes out to RAM to see if there is information stored
in memory that it can use. DRAM is kind of a holding area for information
that may be accessed in the near future. Depending on the server, the amount of
DRAM can measure in the gigabytes.
There is another type of RAM, called Static Random Access Memory
(SRAM). Your first question is probably, "Wait a minute. How can it be
static and random at the same time?" Good question! SRAM is called static
because the information doesn't need to be updated very often. With memory, this update process is called a refresh. SRAM is usually physically bulky
and limited in its capacity. SRAM usually comes in a DIP. SRAM can be used
for cache. Let's start by looking at cache, and then we will explore the different
types of main memory.
Cache comes in much smaller amounts and is much faster than main
memory. It is usually measured in the kilobyte range. The express purpose of
cache is to make it easier for a component to respond to requests for services. Cache memory is used for the processor, it is used for RAID controllers, and it is even used for some types of network cards. In this section, we
are going to look at how a processor uses cache, how RAID uses cache, and
the differences between write-back and write-through cache. (You
might see write-through also spelled "write-thru"; either way, it means
the same thing.)
Processor Cache
When a processor wants to access information, it wants that information as
quickly as it can get it, by using the fewest number of clock cycles. When you see
listings for the cache memory that will be used expressly for the processor, note
that it comes in either Level 1 cache (L1), or Level 2 cache (L2). Level 1 cache
is physically in the actual processor itself. Level 2 cache is usually part of the
mainboard and is dedicated strictly to providing memory for the processor.
Cache memory is always SRAM. SRAM can be on a DIP, a SIMM, or a
DIMM. The cache memory controller is the brains of the cache memory system. When the cache memory controller goes out to get an instruction from
the main memory, it will bring back the next several instructions and keep
them in cache, also. This happens because it is very likely that these instructions will also be needed. Because the instructions are already loaded in
memory, when the CPU makes the call for them, the instruction will be read
from cache, making the computer run faster. When the computer runs faster,
the user is happier and the network administrator's life is easier.
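The prefetching behavior of the cache controller can be modeled in a few lines. This toy model fetches the requested address plus the next few on every miss; the class, names, and numbers are illustrative only, not a description of any real controller:

```python
class PrefetchingCache:
    """Toy cache controller: on a miss, fetch the requested item
    plus the next several, since they are likely to be needed soon."""

    def __init__(self, memory, prefetch=4):
        self.memory = memory      # main memory, modeled as a list
        self.prefetch = prefetch  # how many items to pull in per miss
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.cache:
            self.hits += 1
        else:
            self.misses += 1      # go to main memory, prefetch neighbors
            for a in range(addr, min(addr + self.prefetch, len(self.memory))):
                self.cache[a] = self.memory[a]
        return self.cache[addr]

mem = list(range(100))
c = PrefetchingCache(mem)
for a in range(8):            # sequential access: only every 4th read misses
    c.read(a)
print(c.misses, c.hits)       # 2 6
```

Six of the eight sequential reads are served from cache, which is exactly why prefetching the next several instructions pays off.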
So whenever you see the term cache, remember that this is just another
way to speed things up. Any time something can be read from memory,
rather than having to go to the hard disk or to the BIOS to find the information, it is going to take less time. Cache is just a segment of memory that has
been reserved by the component involved to temporarily store information
or instructions for faster retrieval.
Now, another place where cache memory is used is in RAID systems.
RAID Cache
RAID cache is a perfect server implementation for cache to come into play.
Think about it. You have several hundred people trying to access information from a RAID system or write information to the subsystem, almost
simultaneously. Now, I dont know about the users you have dealt with, but
the users on my network never understood the word patience. The biggest
complaint, it seemed, that people had was the speed of the network. Certainly, those few extra seconds were going to materially affect the standard
of living of some of these people!
Anyway, assume that we are looking at a busy server, or a server that is
dealing with lots of small I/O reads and writes. In this case, if the RAID controller were to become overwhelmed with work, it might have to put some
of the requests on hold while catching up. This is usually not a good thing.
So, the RAID controller uses cache as sort of a waiting room for requests. If
it cannot answer the request immediately, it may place the request in cache
until it can get to it.
As you imagine the differences between L1, L2, and RAID cache, notice
that one of the biggest differences is in size. L1 and L2 are measured in kilobytes while RAID cache is measured in megabytes. As a matter of fact, several
RAID controllers have minimum sizes before the caching will kick into effect.
www.sybex.com
Cache Memory
137
Write-through memory writes information to cache and to main memory at the
same time. Write-back memory, on the other hand, has the CPU updating the
cache during the write, but the actual updating of the main memory is
postponed until the line that was changed in memory is discarded
from cache. At that point, the data that has been changed is written
back to main memory.
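The two write policies can be sketched in a few lines of code. This is just an illustration of the idea; the class names and dictionary-based "memory" are invented for the example, not taken from any real chipset.

```python
class WriteThroughCache:
    """Every write goes to cache AND main memory immediately."""
    def __init__(self):
        self.cache = {}
        self.main_memory = {}

    def write(self, address, value):
        self.cache[address] = value
        self.main_memory[address] = value   # main memory updated on every write


class WriteBackCache:
    """Writes update only the cache; main memory is updated later, when the
    changed (dirty) line is discarded from cache."""
    def __init__(self):
        self.cache = {}
        self.main_memory = {}
        self.dirty = set()

    def write(self, address, value):
        self.cache[address] = value
        self.dirty.add(address)             # main memory is now stale

    def evict(self, address):
        if address in self.dirty:           # write the changed line back first
            self.main_memory[address] = self.cache[address]
            self.dirty.discard(address)
        del self.cache[address]
```

Notice the trade-off: write-through keeps main memory current at the cost of a memory write every time, while write-back batches the work but has to remember which lines are dirty.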
Main Memory
Now we have to look at how all the other memory within the system
works.
So, what do you know about main memory so far? Just that the CPU uses
memory to store information and instructions that it may need later. The
more memory you have, obviously, the more instructions or information can
be stored there. But how is it stored there? We have already looked at one
memory technique when we looked at cache. Let's take a look at a couple of
other ways that memory is used. We are going to look at paged memory,
interleaved memory, and shadow memory before we get into error correction, parity, and all that fun stuff. Some of this stuff may not be part of the
objectives, but, for example, you have to understand how paged memory
works before you can understand why interleaved memory is better.
Paged Memory
A year ago, I bought a file server to use in my lab. When I bought the server,
one of the things that the base server was short on was memory. When I
checked the vital statistics on the server, the marketing information said that
it takes just plain old standard memory, so I figured memory is pretty cheap.
I can slap a few DIMMs in there and bring it up to where I want it to be.
Since it is a lab server, it is older and I bought it for a very good price, so I
figured I could add memory without a problem. When I received the server
and also received the technical specifications, it called for fast page mode
(FPM), error correction code (ECC) memory. That stuff is pricey. Now,
instead of looking at $150 for a 128MB SIMM, I was looking at $500 for a
pair of SIMMs that equal 128 MB. So I did some research and this is what
I found.
Typical memory access by the memory controller is handled in a way that
is similar to reading a book. Just like reading this book, if you want information on paged memory, you access this page. With memory, if it wants
access to certain information, it just accesses the memory page. Once the
page has been accessed, then the information can be gathered in. This process works just great when you are talking about workstations that don't
necessarily have to access information out of memory a lot. When you start
talking about a server, this is a different story. See, with straight paged mode
memory, every time the system wants a bit of information it has to go and
access the appropriate page. With fast page mode memory, this delay is overcome by letting the CPU access multiple pieces of data on the same page,
without having to relocate the page each and every time. This works as long
as the read and the write cycles are on the loaded page.
Fast page mode has certain benefits. For example, there is less power consumption because the pages will not have to be located or sensed each time.
I am pretty sure that the implementation of FPM will not amount to a massive reduction in our electric bill. FPM also has some drawbacks, not the
least of which is price. So, if you can, avoid my mistake, and avoid FPM.
Paged mode memory, on the other hand, simply divides up the RAM in
your system into small addressable groups or pages. The pages can be from
512 bytes to several kilobytes long. The improved memory management on
mainboards has now advanced to the point that it is very similar to fast page
mode, where subsequent memory access from the same page is accomplished
without the CPU having to wait for the memory to catch up. This is referred
to as zero wait state. If the access does take place off the current page, there
may be one or more wait states added while the new page is located.
Now don't confuse paged mode memory with the way Microsoft
Windows 2000 and Novell's NetWare 5.1 use page files to increase memory.
Before we get into interleaved memory, let's look at that.
Page files work like this. When your server comes up, the network operating system (NOS) takes a look at the amount of free space that you have
on your disk subsystem and the NOS then takes part of that free space and
creates what is called a page file. The page file is only for the use of the operating system. This isn't a high-level secret place for network administrators
to store stuff. Now, as we have seen, as the server gets busy, it takes information that it needs, and moves that information into memory. Because it is
a busy server, the length of time instructions can stay in memory may be
exceptionally limited. When it comes time for the information to be flushed
from memory, the system has two choices: It can flush the information from
memory, so the next time it needs that information it can go back to the
application to locate it, or it can move the information to the page file, where
it will be more readily available.
Page files and virtual memory have a language all their own. For example, a
page fault occurs when your system is looking for information in RAM and
cannot find it, so it has to refer to the page file. Page faults come in
two varieties: soft page faults and hard page
faults. Information in memory is stored in frames. When the information is
moved to page files, there has to be a place to temporarily keep that data, and
these are called free frames. The plan is that these frames will be moved into
buffers and then written to disk before replacement data comes along. If a
page fault occurs, and the data is in one of the free frames that has not actually
been written to disk, this is called a soft page fault. If the data has already been
written to disk, it is a hard page fault. Soft page faults are handled more
quickly than hard page faults.
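Here is the soft/hard page fault distinction from the paragraph above, reduced to a sketch. RAM, the free frames, and the disk are modeled as plain dictionaries, and the function name is invented for the example; a real virtual memory manager is far more involved.

```python
def access(page, ram, free_frames, disk):
    """Look up a page the way the text describes: RAM first, then the
    free frames (evicted but not yet written to disk), then the disk."""
    if page in ram:
        return "no fault"                 # found in RAM, no page fault at all
    if page in free_frames:
        ram[page] = free_frames.pop(page) # still in a free frame: cheap to recover
        return "soft page fault"
    if page in disk:
        ram[page] = disk[page]            # must be read back from disk: expensive
        return "hard page fault"
    raise KeyError(page)

ram, free_frames, disk = {"a": 1}, {"b": 2}, {"c": 3}
print(access("a", ram, free_frames, disk))   # no fault
print(access("b", ram, free_frames, disk))   # soft page fault
print(access("c", ram, free_frames, disk))   # hard page fault
```

The soft fault never touches the disk, which is exactly why it is handled more quickly than the hard fault.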
Page files are just near-line storage for information that otherwise would
be stored in memory. So, take a look at Figure 4.4 and you will see how the
CPU uses its memory.
FIGURE 4.4 How the CPU uses its memory [diagram: CPU -> Cache -> Main memory -> Disk]
Page files, or disk swapping, are part of a concept called virtual memory.
The virtual memory concept works like this: In a 32-bit computer, the maximum amount of memory that can be addressed is 4GB. The page file system
or the disk swap space is just an area on the hard disk that can be used as an
add-on to the main or physical memory. The page file then is all the memory
that can be used, and the physical memory is the memory that physically
exists. Look at Figure 4.5 and see if that makes it any clearer.
FIGURE 4.5 Virtual memory [diagram: virtual memory = disk swap space + main memory]
So, with virtual memory space, you are dealing with an address that can
be conceived of but doesn't really correspond to any real memory. If something tries to access it, that attempt generates an error. With page file or swap
file space, if the address is read, the information is on the disk, so it has to be
moved to main memory. This is faster than searching an entire disk because
the memory table has the actual disk location mapped.
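The 32-bit arithmetic behind that 4GB ceiling, and the way swap space adds on to physical memory, can be worked out in a couple of lines. The 256MB/512MB split below is just an example configuration, not anything from the text.

```python
# A 32-bit address is 32 binary digits, so the address space is 2**32 bytes.
ADDRESS_BITS = 32
max_addresses = 2 ** ADDRESS_BITS          # 4,294,967,296 distinct byte addresses
print(max_addresses // (1024 ** 3))        # 4 -> the 4GB limit mentioned above

# Virtual memory, as in Figure 4.5, is physical memory plus disk swap space.
physical_mb = 256                          # example physical RAM
swap_mb = 512                              # example page file / swap area
print(physical_mb + swap_mb)               # 768 -> 768MB of usable virtual memory
```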
Finally, there is the main memory. When the processor wants something
from main memory, it is available immediately.
So, page file memory isn't really memory; it is virtual memory.
Let's get back to the real stuff and look at interleaved memory.
Interleaved Memory
In the eternal quest to make things faster, the next step up the memory food
chain is interleaved memory. The whole reason for using interleaved memory is
that it provides faster response time than paged memory. Check out Figure 4.6.
This is the way paged mode memory accesses information, one step at a time.
FIGURE 4.6 Non-interleaved memory [diagram: CPU -> bus -> Cache -> bus -> Memory]
Compare that to Figure 4.7, which shows the way interleaved memory
accesses four memory banks.
FIGURE 4.7 Interleaved memory [diagram: CPU -> bus -> Cache -> bus -> Memory Banks 0 through 3, each bank on its own bus]
Interleaved memory combines two banks of memory into one. The first
section of memory is even and the second is odd, so memory contents are
alternated between these two sections. When the CPU begins to access memory, it has two areas that it can go to. Faster processors don't have
to wait for one memory read to finish before another one can begin. This
means, for example, that memory access of the odd portion can begin before
memory access to the even portion has completed.
The good news is that interleaving can double your memory's performance. The bad news is that you have to provide twice the amount of memory in matched pairs. Don't be fooled if your PC says it uses interleaving yet allows you to add memory one bank at a time: the computer is simply disabling interleaving, and you may notice a degradation of system performance.
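The even/odd alternation described above amounts to picking a bank from the low bit (or bits) of the address. A minimal sketch, with an invented function name:

```python
def bank_for(address, banks=2):
    """Two-way interleaving sends even addresses to bank 0 and odd
    addresses to bank 1; with four banks, the low two bits decide."""
    return address % banks

# Consecutive addresses land in alternating banks, so a read of address
# N+1 can start in one bank while the read of address N finishes in the other:
print([bank_for(a) for a in range(6)])       # [0, 1, 0, 1, 0, 1]
print([bank_for(a, banks=4) for a in range(6)])  # [0, 1, 2, 3, 0, 1]
```

This is also why interleaving needs matched pairs (or quads): every bank has to exist and be the same size for the address arithmetic to work.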
Shadow Memory
Besides the various types of RAM being used on your server, there is also
memory that is read only. Not surprisingly, it is referred to as Read Only
Memory (ROM).
Testing Tip: If you are like me, once you walk into a testing room, you start to
freeze up and question everything, thereby confusing yourself! And, I tend to
forget what it is that certain things do, for example, Read Only Memory. One
of the things that I have found helpful is to pay close attention to what the
words mean, because unlike marketing or management speak, computerese
tends to be very descriptive. I mean, when you see ROM, if you know the acronym stands for Read Only Memory, you have a really good clue what that stuff
is used for. If it had been named by someone in marketing or management, it
would have been called something like silicon-enhanced, integrated long-term memory paradigm used only for perusal and not for continuous reconfiguration in this regard unless we have at least three meetings. You get my
point!
ROM devices are things like the Basic Input/Output System (BIOS) on
your mainboard. These devices tend to be very slow, with access times in the
several hundreds of nanoseconds. Because your CPU is much faster than
that, ROM access requires your CPU to go through a large number of wait
states before returning instructions, and that just slows down the whole systems performance. How big of a deal is that? Well, think of the things that
have their own BIOS:
Mainboards
Video cards
SCSI controllers
These are things that will be accessed very frequently, so you can see where
it could become an issue. Some computers use a memory management technique called shadowing. When shadowing is employed, the contents of
ROM are loaded into an area of the faster RAM during system initialization. Then the computer maps the fast RAM into memory locations used by
the ROM devices. After that is done, whenever the ROM routines have to be
accessed, the information is taken from the shadowed ROM rather than
accessing the actual IC. In this way, the performance of the ROM can be
increased by more than 300%.
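Shadowing boils down to "copy once at boot, read from the copy ever after." Here is a toy sketch of that idea; the addresses, strings, and function names are all invented for the example, and real shadowing remaps address ranges in the chipset rather than using a lookup like this.

```python
ROM = {0xF000: "BIOS entry", 0xF001: "video routine"}   # the slow ROM device
shadow_ram = {}                                         # fast RAM reserved for shadowing

def initialize():
    """During system initialization, copy the ROM contents into fast RAM."""
    shadow_ram.update(ROM)

def read(address):
    """After init, ROM routines are served from the shadow copy, not the IC."""
    if address in shadow_ram:
        return shadow_ram[address]   # fast path: no ROM wait states
    return ROM[address]              # slow path: the actual ROM chip

initialize()
print(read(0xF000))                  # "BIOS entry", served from shadow RAM
```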
If you pause a minute and take a look at the big picture, you are going
to see that we are talking about some pretty serious stuff. We are talking
about information sets on how the computer will operate, as well as program
information, and data is being moved into and out of memory at a rapid rate.
If that information is not moved correctly, nothing works properly, and your
life is not very much fun. So, it is vitally important that all of the instructions
and all of the data remain error-free. Think about all the things that can
result in corrupt instructions:
Electrical noise
Component failure
Video problems
Parity in the memory subsystem works like this: When a byte is written to
memory, it is checked, and a ninth bit is added to the byte as a checking or
parity bit. When the CPU needs to access the information from memory, the
CPU runs the numbers and calculates the expected parity bit. At that point,
the parity bits are compared and, if they match, the information is deemed
correct. If the parity bits do not match, the system comes up with an error
and, depending on the sophistication of the system, it may actually halt.
Every byte is given a parity bit. If you are working with a 32-bit PC, there are
4 parity bits for every address. If the PC is a 64-bit model, the number of parity bits increases to 8.
There are two types of parity: even parity and odd parity. With even parity, the parity bit is set to 0 when the number of 1s in the byte is even. That
will keep the number of 1s in the calculation even. If the number of 1s in the
byte is not even, then the parity bit will be set to 1, thus making the number
of 1s even.
The reverse is true with odd parity. In this case, the system wants to make
sure there is always an odd number of 1s in the byte. So, if the number of 1s
in the byte is odd, the parity bit is set to 0. If the number of 1s in the byte
is even, the parity bit restores order by being a 1.
If you look at this, you are going to notice that even and odd parity are
exactly opposite, and that is OK. It does not matter in the greater scheme of
things.
Like most things that are simple and are free, parity has some shortcomings.
First of all, when it discovers a problem, it cannot fix the problem. It only
knows that one of the bits in the byte has changed, but it doesn't know which
bit, and it doesn't know if it changed from a 0 to a 1 or from a 1 to a 0. Also, what
happens if 2 bits are corrupted? If a 0 gets changed to a 1 and another 0 gets
changed to a 1, as far as parity is concerned everything is wonderful.
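Both the parity calculation and its two-bit blind spot fit in a few lines. The function names are ours; the rule itself (even parity, one parity bit per byte) is exactly what the text describes.

```python
def even_parity_bit(byte):
    """Even parity: the bit is 0 if the byte already has an even number
    of 1s, and 1 otherwise, so the total count of 1s comes out even."""
    return bin(byte).count("1") % 2

def parity_ok(byte, stored_parity):
    """Recompute the parity and compare it against the stored bit."""
    return even_parity_bit(byte) == stored_parity

data = 0b1101_0010                  # four 1s, already even, so parity bit is 0
p = even_parity_bit(data)

# A single-bit error is caught: the recomputed parity no longer matches.
corrupt_one = data ^ 0b0000_0001    # flip one bit
print(parity_ok(corrupt_one, p))    # False -> error detected

# But two flipped bits slip right past parity, just as the text warns:
corrupt_two = data ^ 0b0000_0011    # flip two bits
print(parity_ok(corrupt_two, p))    # True -> error NOT detected
```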
Given this scenario, like most things in computing, someone decided there had
to be a better way, and that better way was called Error Correction Code (ECC).
ECC Memory
Like everything else, memory schemes evolve, and people whose priorities
are high availability and high reliability understand that higher cost usually
follows. ECC memory works in conjunction with the mainboard memory
controller to add a number of ECC bits to the data bits. Now, when data is
read back from memory, the ECC memory controller can check the ECC
data read back as well.
This means that ECC memory is superior to memory with just parity
for two reasons. First, ECC memory can actually correct single-bit errors
without bringing the system to a halt. It can also detect when there have been
2-bit, 3-bit, or even 4-bit errors, which makes it a very powerful detection
tool. If there is a multi-bit error detected, the ECC memory will report the
error and the system will be halted.
There is some additional overhead with ECC. It takes an additional 7 or
8 bits to implement ECC.
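The text doesn't spell out how those extra ECC bits pinpoint an error, so here is a sketch using the classic Hamming(7,4) code: 4 data bits protected by 3 check bits. Real ECC DIMMs use a wider code over 64 data bits (hence the 7 or 8 extra bits mentioned above), but the correction principle is the same. All names here are ours.

```python
def encode(d):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4              # each check bit covers an overlapping
    p2 = d1 ^ d3 ^ d4              # subset of the data bits
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(codeword):
    """Recompute the checks; the syndrome names the flipped position."""
    c = [None] + codeword          # shift to 1-based indexing
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s3 = c[4] ^ c[5] ^ c[6] ^ c[7]
    error_pos = s1 + 2 * s2 + 4 * s3   # 0 means no single-bit error
    if error_pos:
        c[error_pos] ^= 1              # correct the flipped bit in place
    return [c[3], c[5], c[6], c[7]]    # recover the original data bits

data = [1, 0, 1, 1]
word = encode(data)
word[4] ^= 1                       # simulate a single-bit memory error
print(decode(word))                # [1, 0, 1, 1] -- the error was corrected
```

That is the practical difference from parity: parity can only shout "something changed," while the ECC syndrome says *which* bit changed, so the controller can silently repair it instead of halting the system.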
Have you ever wondered if there was a way to determine whether your system has parity or ECC memory? There is. All you have to do is count the number of memory chips on each module. Parity and ECC memory modules have
a chip count that is divisible by 3. Any chip count not divisible by 3 indicates
that the memory module is non-parity.
Unbuffered memory
Unbuffered memory talks directly to the chipset controller. There is nothing
standing between the memory module and the controller. Therefore, information is written quickly to memory, with very little overhead.
Buffered
Buffered memory is a DIMM that has a buffer chip on it. If you are using a
DIMM with lots of chips on it, it requires a lot of effort on the part of the system
to write information into memory. Some manufacturers will use a re-drive
buffer on the DIMM to just boost the signal and reduce the load on the system. The
buffers are overhead and therefore they introduce a small delay in the electrical
signal.
Registered
With registered memory, the DIMM contains registers that will re-drive or
enhance the signal as it goes through the memory chip. Because the signal is
being enhanced, there can be a greater number of memory chips on the
DIMM. Registered memory and unbuffered memory cannot be mixed.
Just like buffered memory, registers slow things down. Registers delay
things for one clock cycle to make sure that all communications from the
chipset have been collected. This makes for a controlled delay on heavily
used memory.
answer she liked, but she did it, and lo and behold, the number of calls
decreased and her skill level increased. She began to see the humor in the
whole thing, and that is why she gave me a shirt monogrammed with RTFM,
and the note that said I should wear that during difficult classes.
Now this really is a do-as-I-say-and-not-as-I-do situation, because I have
been there, done that, got the T-shirt, and therefore should not have to read
the manual. Every time I take that attitude, I am immediately shot down by
doing something incredibly stupid (and usually costly) to prove the point.
So, let me put it to you this way. Whether you have just gotten a new copy
of the SuperWhizBang 6000 Operating System, or you need to put a card in
a computer, it never hurts to check the hardware compatibility list to see if
that card will actually work in the system. Or, if you really want to be daring,
you can read the compatibility list before buying the card, thus saving yourself time and frustration. These things are written for a reason, and they are
usually on the Internet or come with the program. Check to make sure your
system meets minimum requirements and you will save yourself tons of
headaches later.
Now, when I am buying for my lab, the worst thing that is going to happen is that
I will have a stack of components on my baker's rack that will probably never see
the inside of a server. If I were doing this at a client site, I would have wasted the
client's money and time, not to mention shooting my credibility! When I make a
purchase for a client, I am very careful to make sure that the component appears
on the HCL. If the proposed solution does not appear on the HCL, I then check the
manufacturer's web site to see if there is support available. If there is alleged support available, I then download the drivers and test the component in a lab
machine before trying to install it in a production environment. Remember, we
want our servers to be high availability, and if we take a server down, we have to
make sure that the time out of service is used to the best advantage.
New Stuff
There are some other memory technologies whose names you may run
into that we haven't covered here: Rambus (RDRAM) memory, Double Data
Rate SDRAM (DDR SDRAM), and IBM Memory Expansion Technology
(MXT). Since two of these types of memory require special mainboards, it is
important that you know what the specifications mean before you fill out a
purchase order for new memory for your server.
RDRAM
RDRAM is a memory technology championed by Intel that got off to a rocky start. And its life hasn't
been too great either. Rambus was originally supposed to be the next great
memory advance, but then it got bogged down in life. Delivery was late,
there were squabbles between Intel and memory manufacturers that led to
lawsuits, and then when Rambus finally did hit the market, performance was
nowhere near expectations. In published reports, Intels own benchmarks
showed that less-expensive SDRAM technology running at 133MHz outperformed RDRAM running at 800MHz.
When Intel brought RDRAM to market, they wanted the manufacturers
to pay a licensing fee to Intel for the technology. Well, since the margin in
memory is nonexistent and the traditional memory shopper is looking for
price as well as performance, this strategy did not go over well.
DDR SDRAM
DDR SDRAM is like normal SDRAM in many ways. For example, it works
with the front-side bus clock in the system. The memory runs in step with
the bus clock. This means that, as bus speeds have increased,
so has system performance.
The big difference between the two is the way that DDR moves the data:
it transfers data on both the rising and the falling edge of the clock,
effectively doubling the speed of the SDRAM. This means that if the clock
rate is 133MHz, DDR will transfer data at an effective rate of 266MHz.
DDRs also come in DIMMs, but they will not fit in the standard SDRAM
slot so you have to use a specially designed mainboard. The same problem
with configuration carries over to the laptop market. The SO DIMMs will
need a specially designed mainboard also. The DIMMs will have different
notchings and a different number of pins.
DDRs come in ECC for servers, and non-ECC for workstations.
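The doubling described above is easy to work out. Assuming the standard 64-bit (8-byte) data path of a DIMM, a 133MHz DDR bus makes 266 million transfers per second, which is the familiar "2.1GB/s" figure quoted for DDR-266 modules:

```python
bus_mhz = 133
transfers_per_sec = bus_mhz * 2 * 1_000_000   # both clock edges carry data
bus_bytes = 8                                 # 64-bit DIMM data path

print(transfers_per_sec // 1_000_000)         # 266 -> the "266MHz" data rate
print(transfers_per_sec * bus_bytes)          # 2,128,000,000 bytes/s, about 2.1GB/s
```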
DDR vs Rambus
In a study done by InQuest Market Research in November 1999 (http://
www.inqst.com/ddrvrmbs.htm), it was reported that the performance differences were negligible between Rambus and SDRAM.
Yeah, but that is SDRAM. What are the performance statistics for Rambus and DDR? InQuest used a benchmark called StreamD that was released by the University of Virginia. This benchmark is designed to evaluate the bandwidth of memory to the processor. The margin of error for this benchmark is less than 1%. In this study, DDR beat out Rambus by a significant margin in all tests, exceeding 30% in some cases and averaging a 24.4% performance advantage for this benchmark.
There is another version of the testing suite, this one to show the memory
types that work with Windows. This benchmark is WSTREAM.EXE.
According to the developers, the compound precision error rate is in the
range of 30%, and the developer has said that the program is inaccurate
under Windows NT 4. In tests using Windows 98, InQuest showed that the
DDR performance advantage had decreased to just 2.7%.
Keep in mind, this study was done in the fall of 1999, and the way memory
technology has changed, all bets could be off by the time you read this. As a
matter of fact, at the time this study came out, neither Rambus nor DDR had
been released to the general public yet. Do your research before filling out the
purchase order for any new technology.
Increase Memory
you can't go putting 70ns DIMMs in the computer and hope that things work
correctly. As a general rule of thumb, you'll want to closely match what's
already in the computer. There are several things to consider:
Capacity What is the capacity of the RAM that's currently in the computer? What is the maximum RAM capacity that the computer is capable
of handling? If your computer can handle a maximum of 128MB of RAM
and you've already got 64MB in the computer, you can only add 64 more
megabytes to the computer before it's satiated.
Brand If your manufacturer documentation doesn't have any particular
brand in mind for RAM upgrades, be sure that you pick a known reputable
vendor for your RAM. Don't try to short-sheet your server by purchasing
from an unknown vendor so you can save a buck or two. You'll likely find
that the RAM doesn't work correctly and that you'll have lots of problems
with it.
Speed What is the speed of the RAM, in nanoseconds, that's currently in
the computer? You cannot mix and match RAM speeds. It's vital that you
match the RAM speed currently in the computer with the speed you're
planning on adding.
EDO Extended Data Output (EDO) RAM has the capability of retrieving
the next block of data at the same time as it's sending the previous data
block to the CPU. Do not mix and match EDO and non-EDO RAM. You
might experience difficult-to-diagnose erratic activity with the computer
after the upgrade.
ECC/Non-ECC Error Correcting Code (ECC) memory has the ability
to check the validity of the data as it's passing into and out of the chip. It's
vital to make sure you don't mix up ECC with non-ECC memory.
You may want to consider purchasing all ECC memory for your server
and throwing away any non-ECC chips you might encounter.
SDRAM/RDRAM Synchronous Dynamic RAM (SDRAM) has the
capability of running substantially higher clock speeds than older RAM
chips. Newer SDRAM chips can run at a system's 100MHz bus speed,
thus producing significantly faster throughput. But they bog down when
running much faster than 100MHz. Rambus Dynamic RAM
(RDRAM), a RAM chip invented by Rambus, Inc. (www.rambus.com),
can run at phenomenally higher clock speeds, a maximum of 600MHz
as of this writing. Thus, as newer system buses come out that are capable
of running at higher clock speeds, RDRAM can keep up with the activity. Another kind of RAM, a competitor to RDRAM being designed by
Oftentimes a computer manufacturer's Web site will list the kind of memory
that originally shipped with the computer, thus giving you some documentation that you can utilize when purchasing compatible additions.
This isn't any big deal. Just go into the BIOS, verify that the new memory
size has registered, and then exit, saving changes (being careful not to change
any other BIOS options!). The server will restart, and this time you'll see it
successfully count the memory and pass through power-on without generating any more
errors.
Once the OS has loaded, verify that it sees the correct amount of memory
as well. If you encounter any problems, note any errors that are reported in the
logs. I've never had a problem with an OS not recognizing the proper
amount of RAM if the BIOS has successfully noticed and registered it.
Summary
Exam Essentials
Know the differences between L1 and L2 processor cache. L1 cache is
actually on the processor. L2 cache is usually part of the mainboard and
is used exclusively for the processor, but it is not part of the processor. L1
and L2 cache are measured in kilobytes.
Know why RAID uses cache. RAID systems use cache to improve
throughput and speed up disk reads and writes. RAID cache is measured
in megabytes.
Know the difference between write-back memory and write-through
memory. Write-through memory writes information to cache and to
main memory at the same time. Write-back memory has the CPU updating the cache during the write, but the actual updating of the main memory is postponed until the line that was changed in memory is discarded
from cache.
Know how paged memory and page files work. When your server comes up, the network operating system
(NOS) takes a look at the amount of free space that you have on your disk
subsystem, and the NOS then takes part of that free space and creates what
is called a page file. The page file is only for the use of the operating system. The page file is then used to hold information from memory that may
be used again in the near future.
Know the difference between page faults, soft page faults, and hard page
faults. A page fault occurs when your system is looking for information
in RAM and cannot find it, so it has to refer to the page file. Page faults come in two varieties: soft page
faults and hard page faults. Information in memory is stored in frames.
When the information is moved to page files, there has to be a place to
temporarily keep that data, and these are called free frames. The plan is
that these frames will be moved into buffers and then written to disk
before replacement data comes along. If a page fault occurs, and the data
is in one of the free frames that has not actually been written to disk, this
is called a soft page fault. If the data has already been written to disk, it
is a hard page fault. Soft page faults are handled more quickly than hard
page faults.
Copyright 2001 SYBEX, Inc., Alameda, CA
Know the difference between ECC memory and EDO memory. ECC is
error-correcting memory. EDO memory can begin retrieving the next
block of data while it is still sending the previous block to the CPU.
Know the difference between unbuffered, buffered, and registered memory.
Unbuffered memory writes information directly to the chipset controller.
Buffered memory uses a buffer chip to boost the signal and ease the strain
on the system. With registered memory, the DIMM contains registers that
will re-drive or enhance the signal as it goes through the memory chip.
Know when to use a hardware compatibility list. Whenever you add
hardware to a server, check the NOS hardware compatibility list. If the
component does not appear on the HCL, check the component manufacturer's web site to make sure the appropriate drivers are available.
When in doubt, don't install the device.
RAM upgrade. Know and understand how to upgrade system RAM
and what components to check for when shopping for upgrade RAM.
Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Basic Input/Output System (BIOS)
buffered memory
cache memory
Double Data Rate SDRAM (DDR SDRAM)
Dual Inline Memory Module (DIMM)
dual inline package (DIP)
Dynamic Random Access Memory (DRAM)
error correction code (ECC)
even parity
unbuffered memory
write-back memory
write-through memory
zero wait state
Review Questions
1. What does DIP stand for?
A. Dual internal processors
B. Dynamic Induction Processing
C. Dual Inline Package
D. Dynamic Inline Package
2. SIMMs came in which pin configurations?
A. 28-pin
B. 30-pin
C. 64-pin
D. 72-pin
3. With how many data bits were computers working when the 30-pin
Two of the slots have 64MB DIMMS in them already. Suzanne wants
to add a 128MB DIMM, giving the system a total of 256MB of total
system memory. When she adds the DIMMS, the power-on self-test
memory count shows the full 256MB but she now gets an error telling
her to adjust the BIOS. What could be the problem?
A. Nothing's wrong.
B. Can't pair DIMMs of different capacities.
C. First two DIMMS are ECC DIMMS, new ones not.
D. First two DIMMS are silver-tipped, new ones not.
16. You are going to install interleaved memory. What is the minimum
ECC cannot.
B. ECC is cheaper because the code is actually embedded into a code
detect when there have been 2-bit, 3-bit, or even 4-bit errors.
D. There is no difference.
19. You have a memory module with nine chips on it. What kind of
memory is it?
A. Either parity or non-parity
B. Non-parity only
C. ECC only
D. ECC or parity
20. You have a server that is RAM-starved. You purchase a DIMM from
manufacturer.
B. System requires DIMMs to be installed in pairs.
C. Youve exceeded the systems memory capacity with the DIMM
youre adding.
D. System BIOS needs to be adjusted.
groups of 4.
5. A. Each 72-pin SIMM supported 32 data bits.
6. A. With 72-pin SIMMs, just one module would provide the necessary
32 data bits.
7. B. The two banks were referred to as Bank 0 and Bank 1.
8. D. DIMM stands for Dual Inline Memory Module.
9. C. With a DIMM, the opposing pins remain separate and isolated to
has 70ns DIMMs currently installed, she could run into trouble.
Also, if the server she's trying to upgrade has proprietary memory in
it, she could create some problems by not buying manufacturer-recommended DIMMs for the system. Additionally, it's not a wise
idea to match ECC with non-ECC memory and so forth. Generally
it's a good idea to ascertain what's currently in the system and match
accordingly. The kind of contacts each DIMM has shouldn't affect
the system's operation.
15. A. In almost all cases, after you add memory to a system, you have to
go into the system BIOS and acknowledge that the current memory
count is correct.
16. B. Interleaved memory combines two banks of memory into one, and
the system to a halt. It can also detect when there have been 2-bit, 3-bit,
or even 4-bit errors, which makes it a very powerful detection tool.
19. D. Where the number of chips is divisible by 3, it can be either ECC or
parity memory.
20. A, B. First of all, you should always consult the manufacturer's guide-
Chapter 5

System Bus Architecture

SERVER+ EXAM OBJECTIVES COVERED IN THIS CHAPTER:

3.7 Upgrade peripheral devices, internal and external
Ah, yeah, now we are getting into it. If you take the case
off a server, one of the first things you are going to notice is that big green
board that everything else plugs into. Call it a mainboard, call it a motherboard,
call it whatever you want; that is where all the information must
pass to go anywhere. Everything else we have been talking about won't
work at all if there is something wrong with the motherboard. In this chapter
we are going to look at what sets this component apart and how all the
various parts come together to communicate.
If you have taken the A+ exam, some of the information we are going to
cover will probably be review, but that is not a bad thing. Review can usually
help us all! We are going to start this section by talking about bus basics and
then move into the way the Peripheral Component Interconnect (PCI) local
bus works. At that point, we will cover most of those objectives listed above.
All of these topics relate to information flow and the speed with which it
moves through the server and out to the network.
Bus Basics
You may have noticed that in Chapter 4, Memory, there were several references to the bus speed that went without explanation. My reasoning was
that I would use that as a promo for the good stuff in Chapter 5. So, what is
a bus? A bus is a set of signal pathways that allow information to travel
between the components that make up your computer. These components can
This discussion is going to make use of some terms that were defined and discussed
in earlier chapters: DMA and Bus Mastering. Because these are important
concepts, we will take a couple of sentences to review them here. A Direct Memory
Access (DMA) channel is a channel that a peripheral device can use to write
directly to a set memory address. DMA channels cannot be shared, and the device
does not use the CPU to access memory. Bus Mastering is similar: it is the ability
of a device to perform its function without needing to access the CPU. It writes
information to memory without involving the CPU.
Interrupts
Interrupts are amazing things. When you have installed a component properly,
it is up to the interrupt to get the attention of the CPU when the component has
information or data to send. If the card is not installed properly, and you have
chosen an interrupt that is already being used, the card will either not function
or the system will completely lock up. Fortunately, with the PCI Bus, each
expansion slot (rather than the card) is assigned an interrupt, so the problem
of misconfigured components has been minimized.
When a card or peripheral has some data to send, it uses something called
an interrupt request (IRQ) line. The IRQ is kind of like a student in class
holding up her hand to get the attention of the instructor. In this case,
though, the peripheral is trying to get the attention of the CPU.
Each type of bus has several different IRQs, and some of these
IRQs are reserved. For example, IRQ 0 is reserved for the system timer
and IRQ 1 for the keyboard controller. The other IRQs can be allocated depending on the
peripherals that are installed.
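As an illustration of how those allocations play out, here is a small Python sketch (my own, not from the book) that tracks which IRQs a hypothetical non-PCI system still has free. The reserved assignments are the classic PC defaults; the in-use devices are a made-up example configuration.

```python
# Illustrative sketch: modeling how a technician might track IRQ usage
# on a classic 16-IRQ system.  IRQ 0 (system timer) and IRQ 1 (keyboard
# controller) are the standard PC reservations; the "in use" devices
# below are hypothetical.

RESERVED = {0: "system timer", 1: "keyboard controller"}

def free_irqs(in_use, total=16):
    """Return the IRQ numbers still available for a new peripheral."""
    taken = set(RESERVED) | set(in_use)
    return [irq for irq in range(total) if irq not in taken]

# Example: a hypothetical server with a few common devices installed.
in_use = {3: "COM2", 4: "COM1", 6: "floppy controller", 14: "primary IDE"}
available = free_irqs(in_use)
print(available)  # the IRQs a new non-PCI card could safely claim
```

Run out of entries in that list and you have hit exactly the "finite IRQs" problem the text warns about.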
IRQs are finite, meaning there are only a few that can be used. If your server
has several different peripherals that are not PCI, you could conceivably run
out of IRQs.
Expansion Slots
If you were to look closely at an expansion slot, you would see that it is made up
of several tiny copper finger slots. Each finger slot has a row of very small channels that make contact with the fingers on the expansion circuit board. These
finger slots are then connected to the pathways on the motherboard, and each of
the pathways has a specific function. One of the pathways provides the power
necessary to run the expansion card. Another set of pathways is the data bus,
which, as the name implies, transmits data to and from the processor. Another
set of pathways makes up the address bus. The address bus, you will remember,
allows the device to be addressed, or contacted, by the CPU, using a set of Input/
Output (I/O) addresses. There are also pathways for things like interrupts, direct
memory access (DMA) channels, and clock signals.
It is really pretty easy to tell what type of expansion bus you are using, just
by looking at the motherboard. As you have probably figured out by now,
I am big time into the history of computing, and it is never a bad thing to
know where the industry has come from. Some of these you may never see,
unless you go through a hardware museum, but it never hurts.
ISA 8-Bit Bus
Back in the early days, the expansion bus was only 8 bits wide and had a
blazing speed of 4.77 MHz. There were just eight interrupts (we will talk
about those in just a few pages) and just four DMA channels. By today's
standards, this is slower than horse and buggy, but for the day it was blazing
fast. Since this was the very first PC Bus, and since it was designed by IBM,
the original makers of the IBM PC, they referred to this architecture as
Industry Standard Architecture (ISA). The 8-bit bus connectors are shown
in Figure 5.1, and an 8-bit bus expansion card is shown in Figure 5.2.
FIGURE 5.1 The 8-bit bus connectors
If you look carefully at the picture above, notice how wide those finger slots
were. We will be able to compare those with newer technology in just a second. Here is the type of card that took advantage of that slot (see Figure 5.2).
FIGURE 5.2 An 8-bit bus expansion card
So, what you have here is the old 8-bit slot with an add-on. The ISA Bus
also helped expansion by adding eight more interrupts and four more DMA
channels. It was quite easy to spot the kind of board that fit these new ISA
slots; they looked like Figure 5.4.
FIGURE 5.4 A 16-bit ISA expansion card, showing its 8-bit connector and 16-bit connector
If you look closely at the card, you will see that toward the front of the card
there is an 8-bit connector, separated from the rear connector by a slot. This
architecture was really interesting because of compatibility. For example, if the
expansion card was an 8-bit card, it would run in either an 8-bit or a 16-bit
slot. If you had a 16-bit card, it would naturally run in the 16-bit slot for which
it was designed, but it would also run (albeit a lot slower) in an 8-bit slot. So,
pretty much everything was compatible with everything else.
Micro Channel Architecture (MCA)
About this time in the history of computing, the company that invented the
PC, IBM, was beginning to think the world was passing them by. Their share
of the market was steadily declining and they figured that they had to do
something to get it back. That something was the Personal System/2 (PS/2).
Along with the PS/2, IBM was introducing a new type of data bus called
Micro Channel Architecture (MCA). This bus was supposed to put the ISA
Bus out of business by utilizing a smaller connector with thinner fingers.
MCA was revolutionary because it was available in either 16-bit or 32-bit
versions. Secondly, it could have several Bus Mastering devices installed, and
the bus clock speed was about 25% faster than the old 8 MHz systems,
screaming along at 10 MHz. The really revolutionary part of the puzzle was
the way that you configured the expansion cards. In all the other bus technologies, the cards were configured by jumper settings, or by DIP switches.
With MCA, device configuration was done with software. These were the
first software-configurable expansion cards.
This was an interesting concept, but it had some problems. First of all, all
device configurations were done from a single boot diskette that contained all
the information files for all the devices. When you made a change, the change
was not only written to the card, it was also written to this diskette. That diskette was the only diskette that knew what was in the system and how each
device was configured. At the time, I was doing onsite hardware support. Whenever I ran into a PS/2 device (and some of them were servers), I
knew there was going to be trouble. I would ask for the configuration diskette,
and usually receive a blank look from the customer. It was then up to me to
find a PS/2 diskette and configure the entire system from scratch. Great idea,
but they forgot to take it that one extra step of either saving the configuration
files to a disk or making the devices able to provide configuration information
if asked by a setup program.
You could always tell an MCA card: they don't call IBM Big Blue for
nothing! Look at Figure 5.5.
FIGURE 5.5 An MCA expansion card
As with most things that came out of IBM at the time, the MCA architecture
was very proprietary. At a time when the industry buzzword was "compatible,"
IBM wasn't. In addition, IBM charged vendors who developed their own
expansion cards 5% of their gross receipts. Even way back then, margins
were slim on computer hardware, and this put the cost of MCA peripherals
out of sight.
EISA
Back in the late '80s and early '90s, it was still a computer war out there. IBM was
selling PCs because they were IBM. The catch phrase at the time was, "You
can never get fired for buying IBM." But there was competition, led by the
Gang of Nine, a group of nine computer manufacturers that thought there had to be a better way than MCA to get faster speeds.
The Gang of Nine consisted of some of the top names in the industry at the
time: AST, Compaq, Epson, Hewlett-Packard, NEC, Olivetti, Tandy, Wyse,
and Zenith. They began to offer an alternative to MCA called Extended Industry Standard Architecture (EISA). For a while, EISA was popular in both 386
and 486 computers until about 1993 when PCI came along.
EISA had many of the same things going for it that MCA had, but it also
had compatibility with the older ISA board. Take a look at Figure 5.6 and
Figure 5.7.
FIGURE 5.6
Now, one of the things you do not see in that picture is how deep the connector
slots really were. They were about twice as deep as the old ISA and
8-bit slots. Compatibility was done by staggering the finger slots. Look
closely at Figure 5.7 and you will see that some of the grooves are longer than
others.
FIGURE 5.7
With this type of setup, if you were installing an 8-bit card, it would only
go so deep into the expansion slot. A 16-bit card would go as deep, but use
the back connector. An EISA card on the other hand, would slip all the way
to the bottom of the connector, making a 32-bit data path.
EISA Configuration
Are you familiar with Plug-and-Play hardware? Well, EISA was a precursor
to Plug and Play, and at the time, it was certainly a lot easier than other forms
of hardware installation.
Let's say that you were installing a new network card in an ISA-based
machine. Before you installed the network card, you had to check the computer
to find out (at the very least) what interrupts were being used by other
devices. Then you configured the network card to use an interrupt that was
not being used by any other device, installed it, turned the computer on, and
ran the appropriate driver for the card. If you did your job right, the driver
would load and you had network connectivity. What usually happened, on
the other hand, was that you (okay, read that as I) had guessed wrong and the
IRQ was already in use. This necessitated starting all over. Things changed
with EISA.
With an EISA Bus, on the other hand, you would take the top off the computer and install the EISA card in an EISA slot. The toughest part of the process was remembering what slot you installed it into. Anyway, after the card
was seated, you turned the computer on, and as part of the Power On Self
Test (POST) the computer would figure out that there was something new,
different, and interesting going on inside. The computer would ask you to
configure the device, and you would use a program called EISA Configuration (EISA Config for short) to set the IRQ, DMA, and anything else you
needed to set. All this was done via the slot number, and the information was
then saved on the card. This made a technician's life remarkably easy,
because the EISA Config utility would even go out and check to find out
what settings were already being used. That way, you almost couldn't mess
it up.
The difference between configuring a machine with MCA and EISA was
that with MCA, you needed the diskette with the configuration utility for
that specific computer. Without the specific configuration disk, you reconfigured the whole machine. With EISA, you needed an EISA configuration
utility for that brand of computer. Sometimes, EISA configuration utilities
would even work across brands. So, if you carried around a diskette with the
Compaq EISA Config on it, you could configure all Compaq EISA machines.
There were other enhancements of EISA over ISA:
The CPU, DMA, and Bus Mastering devices could make use of a 32-bit memory-addressing scheme.
The data transfer protocol that was used for high-speed burst transfers was synchronous.
EISA had better ways of handling DMA arbitration and transfer rates.
EISA finally gave way to PCI, which is what the majority of systems on
the market are using today. Let's take a closer look at the new industry
standard bus.
Now that we have had a pleasant walk down memory lane, let's get
closer to the present. When Intel released the Pentium processor, all of the
existing buses became instantly obsolete. Every bus up until this moment had
been of the 16-bit or 32-bit variety, and then along came the Pentium, which
was a 64-bit processor. Using a Pentium processor with a 16-bit or 32-bit
bus would be like pulling the engine out of a Ferrari and replacing it with
something from a Yugo. It just shouldn't be done, and performance would
suffer greatly.
Peripheral Component Interconnect (PCI) works well with the current iteration of the processor. It can handle both a 64-bit and a 32-bit data path. It is
also processor independent, which means that it uses a form of Bus Mastering.
Back in the early days of PCs it was up to the microprocessor in the computer to manage every byte that was moved along the data bus. It was up to
the microprocessor to read the byte from one device or from memory, decide
where that byte belonged, and then write the byte to the proper location. Soon
it became obvious that this was a whole lot of work that could be farmed out
to other devices. The microprocessor, for example, did not need to be handling
everything that went into and out of the expansion bus. After all, the microprocessor is supposed to be the manager of the operation, and all really good
managers know how to delegate responsibilities. Bus mastering is the result of
that delegation.
With Bus Mastering, the microprocessor does not have to be involved in
every transaction. It can delegate control to special circuits called bus controllers, and these bus controllers will direct traffic between different circuits. The
actual device that takes full control of the expansion bus is called a Bus Master.
The device that will end up receiving the data from the Bus Master is called the
bus slave. Some of the more politically correct systems may call the master and
the slave the initiator and the target.
So, the bus controller can manage multiple Bus Masters, and Bus Masters
take control of the actual expansion bus through a process called bus arbitration. Each type of bus has a protocol that is used for managing this arbitration
process. That protocol can be based in hardware or software, though it is usually
hardware-based.
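The arbitration idea above can be sketched in a few lines of Python. This toy round-robin arbiter is my own illustration of the concept, not anything defined by the PCI specification.

```python
# A toy model of round-robin bus arbitration: several would-be Bus
# Masters (initiators) request the bus, and the arbiter grants it to one
# at a time, rotating past the last winner so no master starves.
# Purely illustrative; real arbitration protocols are hardware-defined.

def arbitrate(requests, last_granted=-1, num_masters=4):
    """Grant the bus to the next requesting master after last_granted."""
    for offset in range(1, num_masters + 1):
        candidate = (last_granted + offset) % num_masters
        if candidate in requests:
            return candidate
    return None  # bus idle: nobody requested it

# Masters 1 and 3 both want the bus; master 1 was granted last time,
# so fairness dictates master 3 goes next.
print(arbitrate({1, 3}, last_granted=1))  # -> 3
```

The point of the rotation is fairness: even a chatty Bus Master has to yield once another initiator raises a request.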
PCI Bridges
Bus Mastering makes it sound like there is just one bus, and that is not even
close to the truth. The average PC has several buses and these are usually
operating at different widths and at different speeds. It is kind of a system
board designer's hell. Somehow there has to be a way to hook up all
those different types of buses together and get them to work in a cohesive
way. This really took the forefront when PCI was introduced, because
remember, PCI was designed to be processor independent.
The problem was solved with something called the PCI Bridge. Think
about what a bridge does in your world. It moves things from one location
to another over some kind of obstacle. That is just what a PCI Bridge does,
but it does it with data. The PCI Bridge moves that data from one system bus
to another system bus, and it is up to the bridge to handle all the gory details
of the transfer. This can include things like changing the data format and
protocols without making use of any outside hardware or software products.
The bridge can be some form of standalone hardware, or it may just be
part of the chipset that makes up the PC's mainboard.
PCI Bridges are really busy. In a typical system, for example, the bridge
can handle moving information from the microprocessor bus to the high-speed
PCI Bus and even to an old, outdated ISA compatibility Bus. PCI Bridges can
even link to other PCI Bridges to form a PCI-to-PCI Bridge, or a PPB.
How far can this go? PCI Bridges can be connected to other PCI Bridges up to
a maximum of 256 PCI Buses in a single PC. We will cover more on this when
we talk about Hierarchical Buses and Peer Buses.
FIGURE 5.8 A hierarchical PCI Bus: the processors connect through a Host-to-PCI Bridge (133MB/sec) to the primary PCI Bus with its memory and slots, and a PCI-to-PCI Bridge links a secondary bus of additional slots.
You will notice that there is only one data path to get to the host bus.
Everything has to go through the PCI-to-PCI Bridge to reach the primary bus
and then through the Host-to-PCI Bridge. While this method does provide
for a great number of devices, there is no load balancing capability.
Let's see what it is like with the Peer PCI Bus.
FIGURE 5.9 A Peer PCI Bus: two Host-to-PCI Bridges (133MB/sec each) link two independent PCI Buses and their slots directly to the host bus (540MB/sec), which also carries the processors and memory; a PCI-to-EISA Bridge (33MB/sec) feeds an EISA Bus and its slots.
You will notice that, in this case, the two PCI Buses are linked independently to the processor bus using two Host-to-PCI Bridges. Since there are
two independent buses, there can be two Bus Masters transferring data at the
same time, giving more overall throughput and a higher bandwidth. This is
especially useful if you have a server with two or more peripherals that are
bandwidth intensive. If you split the peripherals between the two buses, you
are in effect creating load balancing.
If you are using a server that makes use of the Peer PCI Bus, the Input/Output
(I/O) subsystems have to be configured. The load balancing
configuration should be taken into account even before the initial system
setup and configuration takes place.
This bus balancing is accomplished by balancing the I/O bandwidth
for each bus, which should produce optimal performance on a system. It works great with Peer PCI Buses, but it may not work as well
with a bridged PCI system. Here are some recommendations on when to do
load balancing.
If you are using a bridged architecture, load balancing is not recommended. With a bridged architecture, make sure the primary bus is the
first one that is populated.
If your Dual Peer architecture also makes use of PCI Hot Plug slots,
there is going to be some tradeoff between high availability and high
throughput.
Bus Balancing
Here are some guidelines on how to balance a PCI load:
If you have several network or array controllers, make sure they are
split between the buses.
Avoid putting two network cards on the same bus, unless both buses
already have a network card installed. It is better to have a system that
has a dual-port network card on each bus, rather than to have two
individual network cards on each bus.
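The guidelines above can be approximated with a simple greedy split. This Python sketch is my own back-of-the-envelope illustration of balancing I/O bandwidth across two Peer PCI Buses; the card names and bandwidth figures are hypothetical.

```python
# A back-of-the-envelope sketch of splitting controllers across two Peer
# PCI Buses: greedily place each card on the less-loaded bus so the I/O
# bandwidth stays roughly balanced.  Card names and MB/sec figures are
# made-up examples.

def balance(cards):
    """cards: list of (name, expected MB/sec).  Returns bus rosters and loads."""
    buses = [[], []]
    load = [0, 0]
    # Place the hungriest cards first for a better greedy split.
    for name, mbps in sorted(cards, key=lambda c: -c[1]):
        target = 0 if load[0] <= load[1] else 1
        buses[target].append(name)
        load[target] += mbps
    return buses, load

cards = [("NIC A", 12), ("NIC B", 12), ("array ctlr 1", 40), ("array ctlr 2", 40)]
buses, load = balance(cards)
print(buses, load)  # the two array controllers land on different buses
```

Notice the result matches the text's advice: the bandwidth-hungry array controllers end up split between the buses, with one network card on each.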
So, how does the processor bus know when the network cards need attention? PCI Buses do use interrupts; they just use them a little differently.
PCI Interrupts
PCI is a self-contained expansion bus. The interrupts that would normally be
set at the card level are managed at the expansion-slot level by the software
that drives the peripheral devices. With PCI, there are four level-sensitive
interrupts that have interrupt sharing enabled, and these can amount to up
to 16 separate interrupts when examined as a binary value. The PCI specification does not define what the actual interrupts are for each slot or even
how they are to be shared. All of that design relationship is left up to the person who is designing the expansion device. That means that these details are
usually not handled at the hardware level, as was the case in the earlier architectures. With PCI devices, the software device driver for the board handles
the interrupt configuration. These interrupts are really independent in a way,
because they are not synchronized with any of the other bus signals and so
they can be activated at any time.
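The arithmetic behind "up to 16 separate interrupts" is just the number of on/off combinations of the four interrupt lines, as this quick Python enumeration shows. (The INTA#-INTD# names are the conventional PCI labels for the four lines; the enumeration itself is my own illustration.)

```python
# PCI defines four level-sensitive interrupt lines (conventionally
# INTA#-INTD#).  Reading the four lines together as a binary value
# yields 2**4 = 16 distinct states -- the "16 separate interrupts"
# the text mentions.

from itertools import product

lines = ["INTA#", "INTB#", "INTC#", "INTD#"]
states = list(product([0, 1], repeat=len(lines)))
print(len(states))  # -> 16 possible combinations of the four lines
```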
As you can see, the finger slots in the bus are very small, and packed very
closely together. These expansion slots are usually white, and they are
divided into two sections.
There are two different kinds of PCI expansion slots and the voltage the
slots use differentiates the versions. One of the types uses +5 volts DC to
power the expansion card, while the lower-voltage model uses 3.3 volts.
When you look at the connector for the buses, the only difference is the
positioning of the blocker in each connector. This blocker, or key, keeps the
3.3-volt card from being plugged into a 5-volt slot.
Now, you have been wondering why I spent all that time covering some
of that other stuff on PCI Bridges and what have you. Well, when we talk
about expansion slots, one of the first questions that comes to mind is how
many can you have. The answer to that is, "It depends." As you start stuffing
more and more stuff in a smaller and smaller space, something has to give,
and it is usually the electrical effects inside any given system. Because PCI
operates at a high bus speed, it is especially susceptible to high frequencies,
radiation, and other forms of electrical interference. The current standards
for a local bus limit to three the number of high-speed devices that can be
connected to a single bus.
If you paid close attention there, you noticed that the standard calls for just
three devices, not three slots. Most local bus systems now have their video
display built into the motherboard. That circuit counts as a local bus device,
so, if your PC has video on the motherboard, you can use two local bus
expansion slots.
The limit of three devices comes from speed considerations. The bigger
the bus, the more connectors there are. More connectors means that
any signal placed on a circuit will degrade more quickly, and the only way
to beat the degradation is to start with a stronger signal. Somewhere, someone
had to draw the line, and the line was drawn at three devices.
While it seems that three devices may be limiting, it is not. Remember our
discussion of PCI Bridges? Well, since the three-device limit is per expansion
bus, the PCI Bridge allows multiple expansion buses to be linked together.
Each of these will use its own bus control circuitry. While this may sound
complicated, it is one of those things that doesn't really make any difference.
After all, as long as it works, that is all that counts, and the design is all in
the chipset.
Is there a way you can use this technology not only to increase the performance, but also to increase the availability?
When you are talking about hot swapping PCI devices, there are two sets of
specifications: the Hot Swap PCI Specifications and the CompactPCI Hot Swap
Specifications, which are managed by the PCI Industrial Computer Manufacturers Group (PICMG). The two standards are very similar, differing in only
a couple of areas. For example, with the PICMG Specifications, the backplane
that the device plugs into is passive, and all the logic is contained on the
adapter card. This same logic is used to power up the adapter card.
Making it easier for the developers is the fact that the devices are controlled
by software. It is up to the system software to provide the smarts for this whole
process to work.
FIGURE 5.11 The hot plug software stack: an application talks to the hot plug service, the service makes OS calls, and the OS drives the PCI hardware.
Within the general use and the specific use categories are three levels that
define how the live-insertion capability is carried out. These levels are Basic
Hot Swap, Full Hot Swap, and High Availability.
Basic Hot Swap The end user must tell the operating system that a card
is going to be inserted or removed. This is usually done from the system
console.
Full Hot Swap This category adds to the functionality of Basic Hot
Swap. In this case, there is a microswitch added to the card's injector/ejector
mechanism. This way, the technician does not have to tell the operating
system that the change is about to occur. When the card is installed or
removed, the switch changes the electrical configuration and gives the OS
a warning that the process is about to occur.
High Availability This level provides the greatest functionality for reconfiguring software while the system is running. This allows for on-the-fly
reconfiguration of both the hardware device and the software components.
In this case, the operating system itself can sense when a card has failed, and
the OS will bring a previously installed replacement card online to assume
the duties of the failed device.
Copyright 2001 SYBEX, Inc., Alameda, CA
The ability to choose to isolate a card from the logic on the system
board.
With Hot Plug PCI, the user cannot remove or install a PCI card without
first telling the software. Once the software has been notified, it performs the
steps necessary to shut down the card connector so the card can be removed
or installed. It is up to the operating system to visually let the end user know
when it is all right to install or remove the card.
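That notify-first sequence can be modeled with a tiny state sketch. This Python class is purely illustrative (no real hot plug API looks like this), but it captures the ordering the text requires: software is told first, the connector is shut down, and only then is physical removal safe.

```python
# A schematic sketch of the Hot Plug PCI sequence: the user notifies the
# software first, the software powers down the slot connector, and only
# then is physical removal allowed.  Illustrative only -- not a real API.

class HotPlugSlot:
    def __init__(self):
        self.powered = True
        self.removal_ok = False  # e.g. an indicator for the end user

    def request_removal(self):
        # Software shuts down the card connector...
        self.powered = False
        # ...then visually signals the user that removal is safe.
        self.removal_ok = True

    def remove_card(self):
        if not self.removal_ok:
            raise RuntimeError("notify the software before pulling the card!")
        return "card removed safely"

slot = HotPlugSlot()
slot.request_removal()
print(slot.remove_card())  # -> card removed safely
```

Pulling the card without the request step raises an error in the model, mirroring what would be a hardware fault (or worse) on a real system.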
The advantage of Hot Plug is that you can use any PCI card in the system.
Changes are needed to the chipset, the system board, the operating system,
and the drivers.
It improves the throughput of the server, because the I/O of the peripheral
is removed from the CPU.
It can also increase fault isolation and recovery, which works to provide for
higher availability.
AGP
In each of the previous sections we have been talking about throughput,
especially when it comes to network interface cards and disk controllers. In
each of these sections we stressed how important it was to off-load the mundane tasks from the processor, in effect, giving it more time for the serious
processor tasks.
This section is going to take a somewhat different tack, concentrating on
video. Now, this is not necessarily a topic I normally associate with servers,
because usually servers are servers and high-definition graphics is not all that
important.
AGP is short for Accelerated Graphics Port. It is an interface based on
PCI and designed for the throughput demands of high-demand video such as
3-D graphics. Rather than using PCI for graphics data, AGP has a dedicated point-to-point channel to directly access main memory. AGP runs at
66 MHz over a data channel that is 32 bits wide, providing bandwidth of
266 MBytes/second. This compares to the PCI bandwidth of 133 MBytes/
second. In addition, there are two optional, faster video modes, providing
throughputs of 533 MBytes/second and 1.07 GBytes/second. AGP can support this kind of throughput by storing some of the 3-D textures in main
memory rather than in video memory.
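A quick back-of-the-envelope check of those bandwidth figures (my own sketch; the numbers follow from a 32-bit channel at the exact 66.66 MHz AGP clock, with the faster modes transferring two or four times per clock):

```python
# Checking the chapter's AGP numbers: a 32-bit (4-byte) channel at
# ~66.66 MHz moves roughly 267 million bytes/sec (commonly quoted as
# 266 MBytes/sec).  The 2x and 4x modes simply transfer 2 or 4 times
# per clock, giving ~533 MBytes/sec and ~1.07 GBytes/sec.

CLOCK_HZ = 66_666_666        # AGP clock, ~66.66 MHz
BYTES_PER_TRANSFER = 4       # 32-bit data channel

def agp_bandwidth(transfers_per_clock):
    """Approximate throughput in MBytes/sec for a given AGP mode."""
    return CLOCK_HZ * BYTES_PER_TRANSFER * transfers_per_clock / 1_000_000

print(round(agp_bandwidth(1)))  # 1x mode: ~267 MBytes/sec
print(round(agp_bandwidth(2)))  # 2x mode: ~533 MBytes/sec
print(round(agp_bandwidth(4)))  # 4x mode: ~1067 MBytes/sec (~1.07 GBytes/sec)
```

The same arithmetic gives PCI's 133 MBytes/sec: the same 32-bit width at half the clock (33 MHz), which is why AGP 1x is exactly double PCI.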
Why would you install AGP on a server? Well, if there are 3-D applications that you have to run on the server, AGP will help off-load some of the
work that is placed on the CPU. If there is a 3-D application running, the
CPU (without an AGP graphics controller) is responsible for performing all
those intensive 3-D calculations. The graphics controller can process the texture data and the bitmaps. At this point, the controller has to read information from several different textures and then average the information into a
single pixel on the screen. While this calculation is being performed, the pixel
is stored in the memory buffer. Since the textures are very large, there isn't
room in the video card's buffer. AGP overcomes this shortcoming by storing
the image in main system memory.
When AGP wants to access the texture data, it uses a process called Direct
Memory Execute (DIME). DIME connects the system's main memory to the
AGP/PCI chipset.
So, should you be looking for an AGP controller in your server? If your
server is going to be physically running some 3-D applications, it may be
something you want to look at. However, you should know that several published studies question whether there really is a performance increase over
using just a PCI video card. If your system is going to be making use of AGP,
you should definitely add more memory to the server to provide the extra
memory that the video subsystem needs.
All the network administrators that I know really hate it when they
hear the complaint, "Gosh, the network is slow today!"
In this chapter, I have laid out several different technologies that can help
you provide both performance and high-availability solutions. As you saw at
the beginning, the early motherboard's bus technology was speed-limiting,
not only in processing power but also in moving the information from the
processor back to the user who requested it. PCI helped to change that.
Each of the technologies that we have talked about has stressed the same
philosophy: Take the mundane calculations away from the processor and let
something else handle it. That way, the processor is freed up to do other
things. This, in turn, speeds up performance.
When you design your server, pay close attention to the types of subsystems
that are present, and be sure to take full advantage of them. Also understand
that for each of the performance-enhancing technologies that you opt to have
in your server, there are going to be trade-offs. Usually that trade-off will come
in the form of how large a check you will have to write to pay for the server.
Realize too that you can often get a few extra miles out of an older peripheral by simply upgrading its system BIOS. This may not be possible with all
devices, but many of them have the ability to have their firmware updated to
make them compatible with newer operating systems.
can use for your device. In situations where you're not sure about the IRQs,
figure out what IRQs are in use first; then you'll know what's available for
the new device.
Direct Memory Access Direct Memory Access (DMA) provides a way
for data to be transferred from a device to system memory, or vice versa,
without having to go through the CPU, thus freeing up CPU cycles. You
set up a DMA channel for the data to go through. DMA isn't heavily used,
but it should be used more than it is. When purchasing new peripheral
gear, check the product's documentation to see if it can use DMA, then
decide which DMA channel you'd like to set up for the device.
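On Linux, the DMA channels that drivers have already claimed can be listed the same way. A sketch, assuming a kernel that exposes /proc/dma (kernels built without ISA DMA support may not, hence the fallback):

```shell
# List the DMA channels that drivers have registered, so you can
# pick a free one for the new device. Fall back to a message if
# this kernel does not expose /proc/dma.
cat /proc/dma 2>/dev/null || echo "no DMA channel information available"
```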
Cabling Cabling is a huge issue for external SCSI devices. You'll have to
look at the back of the computer to determine what type of connection the
internal SCSI adapter's external port has. Next you determine what kind
of SCSI connection the new device is expecting. Finally you purchase a
cable that matches the configuration. For example, suppose that you're
going to purchase an Ultra-SCSI device but plug it into a SCSI II external
port on the computer's SCSI adapter. You'll need a SCSI II-to-Ultra-SCSI
cable. You'll want to make sure which side is male and which is female as
well, before you go looking for the cable. You can buy adapters that fit
onto a SCSI cable to make the cable work with different SCSI versions. I
think you're setting yourself up for data transfer problems if you purchase
an adapter, because it could work loose and cause you some problems that
may be hard to diagnose.
Note that you might want to go into the SCSI adapter's BIOS to tweak it so
it works with the new device. Check your SCSI adapter's documentation
for more information on adjusting BIOS settings.
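Before digging into the adapter's BIOS, it can help to confirm what the operating system already sees on the SCSI bus. A sketch for a Linux server (the /proc/scsi interface exists only when SCSI support is loaded, hence the fallback):

```shell
# List the SCSI devices the kernel has detected, including the
# channel, ID, and LUN of each, plus vendor and model strings.
cat /proc/scsi/scsi 2>/dev/null || echo "no SCSI devices reported"
```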
Power Some peripherals require a power socket and are powered separately
from the computer. Be aware of this before you buy, so that if you're
lacking enough power sockets where you want to place the peripheral, you
can get the electrical work done before the peripheral comes in. For example,
some backup tape devices require a substantial power supply, and you'll
have to address the power needs before the gear can be put into production.
When your new gear comes in, read its documentation thoroughly and be
sure you understand how to install and configure the device. Lots of times
it's easy to get in a hurry and think that you don't need to bother with reading
the documentation, but it's always worth your while to be sure you read
and understand how the device is supposed to interplay with your system.
Summary
Exam Essentials
Know the basics of PCI bus mastering PCI Bus Mastering improves
performance by letting a device take control of the bus and direct signals
straight to other components. This is one way of making sure the CPU is involved in
only those transactions that it really has to act on. If the workload of the
CPU is eased, your server should experience better performance.
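On a Linux server you can see which installed PCI devices are actually running as bus masters. A sketch using lspci from the pciutils package (it may not be installed on every system, hence the fallback):

```shell
# lspci -v prints a "Flags:" line for each device; the phrase
# "bus master" appears for devices currently enabled as PCI
# bus masters.
lspci -v 2>/dev/null | grep -i "bus master" || echo "lspci not available"
```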
Know the basics of PCI hot swap or PCI hot plug PCI Hot Swap
means that you can remove a bad component and replace it without shutting off the server. PCI Hot Plug means that you can add a component
without taking the server out of service.
Know the basics of a hierarchical and peer PCI bus With a hierarchical
PCI Bus, the buses in the hierarchy operate concurrently. That means that
a PCI Master and a PCI target on the same PCI Bus can communicate even
if another bus in the hierarchy is busy. With a Peer PCI Bus there are two
independent buses. This means that there can be two Bus Masters transferring
data at the same time, giving more overall throughput and higher bandwidth.
This is especially useful if you have a server with two or more
peripherals that are bandwidth intensive. If you split the peripherals
between the two buses, you are in effect creating load balancing.
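The actual arrangement of buses and bridges in a given server can be inspected directly on Linux; a sketch with lspci's tree view (again assuming the pciutils package is installed, hence the fallback):

```shell
# -t draws the bus/bridge/device tree so you can see how many PCI
# buses the machine has and which devices sit behind which bridge;
# -v adds the device names to each node.
lspci -tv 2>/dev/null || echo "lspci not available"
```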
Know what interrupts are and how the system uses them Interrupts
(IRQs) are the way components get the attention of the CPU.
Know that EISA is a form of system bus; know how the architecture of the
system bus can affect server performance The architecture of the system
bus will determine how much information can flow to various components
at any given time. The faster the bus, with the appropriate components, the
better the performance should be.
Be able to upgrade a variety of devices Know and understand the
complexities and nuances of installing upgraded peripheral devices.
Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Accelerated Graphics Port
address bus
Basic Hot Swap
blocker
bus
bus arbitration
bus controllers
Bus Master
Bus Mastering
bus slave
clock signals
data bus
direct memory access (DMA)
Direct Memory Execute (DIME)
EISA Configuration
expansion slot
Extended Industry Standard Architecture (EISA)
external bus
Full Hot Swap
Grant (GNT#)
Hardware Device Module (HDM)
Hierarchical PCI Bus
Host-to-PCI Bridge
Hot Plug PCI
Hot Swap
I2O Messaging Layer
Industry Standard Architecture (ISA)
Input/Output (I/O)
Intelligent Input/Output (I2O)
interrupt request (IRQ)
interrupts
load balancing
Micro Channel Architecture (MCA)
OS Services Module (OSM)
PCI Bridge
PCI Hot Plug
PCI-to-PCI Bridge
Peer PCI Bus
Review Questions
1. What was the first computer bus referred to as?
A. EISA
B. MCA
C. ISA
D. I2O
2. EISA was referred to as which of the following?
A. An 8-bit bus
B. A 12-bit bus
C. A 16-bit bus
D. A 24-bit bus
E. A 32-bit bus
3. If you want to do PCI load balancing, what will you need to have?
A. A Peer Bus
B. A Hierarchical Bus
C. Hot Swap devices
D. Hot Plug devices
4. Which version of Hot Swap PCI requires users to notify the operating
system that they are about to take a device out of the system?
A. Basic Hot Swap
B. Full Hot Swap
C. High availability
D. All of the above
interface cards and a single drive array controller, how would you plan
to install them in a Dual Peer Bus configuration?
A. Put both of the NICs on one bus and put the drive array controller
on the other.
B. Put both NICs on the master PCI Bus and the drive array controller
to either one.
D. There cannot be more than one NIC in any server.
7. What is Bus Mastering?
A. All transactions are sent directly to the processor.
B. All transactions are sent directly to memory.
C. All transactions directed to the disk array controller are directed to
www.sybex.com
Review Questions
203
DLT tape changer. Both devices are SCSI. Now the computer won't
boot to the NOS, and Johann is getting a SCSI IRQ conflict error even
though he verified that he's using the same IRQ as the old backup
device. What could be the problem?
A. New device isn't terminated.
B. Device is trying to use six IRQs.
C. PCI bus is autodetecting the wrong IRQ.
D. New device's BIOS hasn't been enabled.