

Server+™
Study Guide

Gary Govanus
with William Heldman
Jarret Buse

San Francisco · Paris · Düsseldorf · Soest · London



Associate Publisher: Neil Edde


Contracts and Licensing Manager: Kristine O'Callaghan
Acquisitions and Developmental Editor: Elizabeth Hurley
Editor: Linda Stephenson
Production Editors: Judith Hibbard, Shannon Murphy
Technical Editors: Scott Warmbrand, Donald Fuller
Contributor: Rebecca Monson
Book Designer: Bill Gibson
Graphic Illustrator: Tony Jonick
Electronic Publishing Specialist: Susie Hendrickson
Proofreaders: Jennifer Campbell, Nanette Duffy, Amey Garber, Leslie E. H. Light, Yariv Rabinovitch
Indexer: Nancy Guenther
CD Coordinator: Erica Yee
CD Technician: Kevin Ly
Cover Designer: Archer Design
Cover Photographer: Tony Stone Images
Copyright © 2001 SYBEX Inc., 1151 Marina Village Parkway, Alameda, CA 94501. World rights reserved. No part of this
publication may be stored in a retrieval system, transmitted, or reproduced in any way, including but not limited to photocopy,
photograph, magnetic, or other record, without the prior agreement and written permission of the publisher.
Library of Congress Card Number: 00-109136
ISBN: 0-7821-2893-9
SYBEX and the SYBEX logo are either registered trademarks or trademarks of SYBEX Inc. in the United States and/or other
countries.
Screen reproductions produced with FullShot 99. FullShot 99 © 1991-1999 Inbit Incorporated. All rights reserved.
FullShot is a trademark of Inbit Incorporated.
The CD interface was created using Macromedia Director, COPYRIGHT 1994, 1997-1999 Macromedia Inc. For more
information on Macromedia and Macromedia Director, visit http://www.macromedia.com.
Internet screen shots using Microsoft Internet Explorer 5.5 reprinted by permission from Microsoft Corporation.
Sybex is an independent entity from CompTIA, and not affiliated with CompTIA in any manner. Neither CompTIA nor
Sybex warrants that use of this publication will ensure passing the relevant exam. Server+ is either a registered trademark
or trademark of CompTIA in the United States and/or other countries.
TRADEMARKS: SYBEX has attempted throughout this book to distinguish proprietary trademarks from descriptive terms
by following the capitalization style used by the manufacturer.
The author and publisher have made their best efforts to prepare this book, and the content is based upon final release
software whenever possible. Portions of the manuscript may be based upon pre-release versions supplied by software
manufacturer(s). The author and the publisher make no representation or warranties of any kind with regard to the completeness or accuracy of the contents herein and accept no liability of any kind including but not limited to performance,
merchantability, fitness for any particular purpose, or any losses or damages of any kind caused or alleged to be caused
directly or indirectly from this book.
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1


To Our Valued Readers:


Sybex is proud to serve as a cornerstone member of the Server+ Advisory Committee. Just as CompTIA
is committed to establishing measurable standards for certifying individuals who will support computer
and networking systems in the future, Sybex is committed to providing those individuals with the skills
needed to meet those standards. By working alongside CompTIA, and in conjunction with other
esteemed members of the Server+ committee, it is our desire to help bridge the knowledge and skills gap
that currently confronts the IT industry.
Sybex expects the Server+ program to be well received, both by companies seeking qualified technical
staff and by the IT training community. Along with the existing line of vendor-neutral certifications
from CompTIA, including A+, Network+, and I-Net+, the Server+ certification should prove to be an
invaluable asset in the years ahead.
Our authors and editors have worked hard to ensure that the Server+ Study Guide is comprehensive, in-depth, and pedagogically sound. We're confident that this book will meet and exceed the demanding
standards of the certification marketplace and help you, the Server+ exam candidate, succeed in your
endeavors.
Good luck in pursuit of your Server+ certification!

Neil Edde
Associate Publisher, Certification
Sybex, Inc.

SYBEX Inc. 1151 Marina Village Parkway, Alameda, CA 94501


Tel: 510/523-8233
Fax: 510/523-2373 HTTP://www.sybex.com


Software License Agreement: Terms and Conditions


The media and/or any online materials accompanying this book
that are available now or in the future contain programs and/or
text files (the "Software") to be used in connection with the book.
SYBEX hereby grants to you a license to use the Software, subject
to the terms that follow. Your purchase, acceptance, or use of the
Software will constitute your acceptance of such terms.
The Software compilation is the property of SYBEX unless otherwise indicated and is protected by copyright to SYBEX or other
copyright owner(s) as indicated in the media files (the
"Owner(s)"). You are hereby granted a single-user license to use
the Software for your personal, noncommercial use only. You
may not reproduce, sell, distribute, publish, circulate, or commercially exploit the Software, or any portion thereof, without the
written consent of SYBEX and the specific copyright owner(s) of
any component software included on this media.
In the event that the Software or components include specific
license requirements or end-user agreements, statements of condition, disclaimers, limitations or warranties ("End-User
License"), those End-User Licenses supersede the terms and conditions herein as to that particular Software component. Your
purchase, acceptance, or use of the Software will constitute your
acceptance of such End-User Licenses.
By purchase, use or acceptance of the Software you further agree
to comply with all export laws and regulations of the United
States as such laws and regulations may exist from time to time.
Software Support
Components of the supplemental Software and any offers
associated with them may be supported by the specific
Owner(s) of that material but they are not supported by
SYBEX. Information regarding any available support may
be obtained from the Owner(s) using the information provided in the appropriate read.me files or listed elsewhere on
the media.
Should the manufacturer(s) or other Owner(s) cease to offer support or decline to honor any offer, SYBEX bears no responsibility. This notice concerning support for the Software is provided
for your information only. SYBEX is not the agent or principal of
the Owner(s), and SYBEX is in no way responsible for providing
any support for the Software, nor is it liable or responsible for any
support provided, or not provided, by the Owner(s).
Warranty
SYBEX warrants the enclosed media to be free of physical defects
for a period of ninety (90) days after purchase. The Software is
not available from SYBEX in any other form or media than that
enclosed herein or posted to www.sybex.com. If you discover a
defect in the media during this warranty period, you may obtain
a replacement of identical format at no charge by sending the
defective media, postage prepaid, with proof of purchase to:

SYBEX Inc.
Customer Service Department
1151 Marina Village Parkway
Alameda, CA 94501
(510) 523-8233
Fax: (510) 523-2373
e-mail: [email protected]
WEB: HTTP://WWW.SYBEX.COM
After the 90-day period, you can obtain replacement
media of identical format by sending us the defective disk,
proof of purchase, and a check or money order for $10,
payable to SYBEX.
Disclaimer
SYBEX makes no warranty or representation, either expressed or
implied, with respect to the Software or its contents, quality, performance, merchantability, or fitness for a particular purpose. In
no event will SYBEX, its distributors, or dealers be liable to you
or any other party for direct, indirect, special, incidental, consequential, or other damages arising out of the use of or inability to
use the Software or its contents even if advised of the possibility of
such damage. In the event that the Software includes an online
update feature, SYBEX further disclaims any obligation to provide this feature for any specific duration other than the initial
posting.
The exclusion of implied warranties is not permitted by some
states. Therefore, the above exclusion may not apply to you.
This warranty provides you with specific legal rights; there may
be other rights that you may have that vary from state to state.
The pricing of the book with the Software by SYBEX reflects the
allocation of risk and limitations on liability contained in this
agreement of Terms and Conditions.
Shareware Distribution
This Software may contain various programs that are distributed
as shareware. Copyright laws apply to both shareware and ordinary commercial software, and the copyright Owner(s) retains all
rights. If you try a shareware program and continue using it, you
are expected to register it. Individual programs differ on details of
trial periods, registration, and payment. Please observe the
requirements stated in appropriate files.
Copy Protection
The Software in whole or in part may or may not be copy-protected or encrypted. However, in all cases, reselling or redistributing these files without authorization is expressly forbidden
except as specifically provided for by the Owner(s) therein.


This book is dedicated to my two daughters, Denise Boedeker and Dawn Carpenter. Thank you for letting me be a part of your life and giving me the opportunity to watch you grow into such wonderful women.
Gary Govanus


Acknowledgments
You know, this is the toughest part of the entire book to write. You may not
believe it, but it is true. So many people have done so much to get this book on
the shelf and into your hands that it is not possible for me to list them all. I also
have the task of trying to help you understand how important all of those people are to this project. Believe me, it is much easier explaining how Ethernet or
a router works than to try to explain the differences between an acquisition
and development editor, a production editor, and an editor!
Most importantly, I would like to thank my wife, Bobbi, for all her love
and understanding during this process. I have been writing for Sybex almost
continuously for two years, and she has been wonderful during the whole
time. It is a lot harder than it sounds, because when there are deadlines, or
I am trying to teach and to write at the same time, something has to give, and
usually it is the attention I pay to her. She knows how much I love doing this,
so she puts her needs on the back burner. She really is a wonderful woman,
and I am very lucky to have her in my life.
There are others who get shortchanged when I write. I sometimes have to
really work to find the time to make my daughters, Dawn and Denise, crazy.
Fortunately, they are now old enough where they have very full, successful
lives of their own. I don't get to see my grandkids nearly enough, so as soon
as I finish this thing, I am taking them to Disney World. Brandice has been
there before (several times), but CJ and Courtney haven't, so it will be a treat
for Poppy to see the wonder in their eyes. My parents have not had as much
of my time as they deserve either, and for that I am sorry. Finally, there is my
best friend, John Hartl. John is a quiet man, but can do a wonderful job of
laying guilt. He did it when he pointed out that seeing your best friend once
a year was not enough, and he was tired of me using the *&^% book as an
excuse. He is right!
Now for the people on the production team. This is the first book I have
done with Elizabeth Hurley. Since she approached me with this project, she
has been promoted to an acquisition and developmental editor, a position
she richly deserves. Soon she will be running the place. Her good humor and
infectious laugh always can brighten my day. I hope that this book will justify the faith she has had in me. Every time I came to her with a question, she
would say, "Gary, you do what you think is best; I trust you completely."
You have no idea how close we came to changing this book into the novel I
always wanted to write!

Copyright 2001 SYBEX, Inc., Alameda, CA

www.sybex.com

Judith Hibbard is the Production Editor on this project. She is an amazing


woman, because she actually fought to get on this book because she liked
working with me on a Windows 2000 book we did together. She is the stabilizing factor in the process, able to handle with dignity and grace anything
life or I can throw at her. Whenever I would call and rant and rave about
something, she would patiently listen, add insightful comments here and
there, and then bring me back to reality. I just love working with her, and the
wonderful part is that we have nurtured a friendship besides a working relationship. I cherish that.
Linda Stephenson is also truly a brave woman. This is book number three
that she has worked on with me, and she is the editor who has to read all this
stuff and try to make sense out of it. Linda does really write a lot of the book.
I do the research, I try and get the information on the page in some readable
format, but she completes the job. How she does all this with her busy life, I
will never know! She even volunteers to take busloads of children on weeklong
field trips.
Thanks also to my great friend, Becky Monson. Becky and I have been
talking about working together for years, and have never done it. She was
kind of drafted into writing some of the sample test questions, like a couple
of hundred. I am amazed that she is still talking to me! Becky, I told you that
wasn't a fun job!
I would be remiss if I did not also thank Bill Heldman and Jarret Buse. They
helped me make sure that this book provides the widest possible coverage of
CompTIA's Server+ objectives. Their analysis and contributions were invaluable to me.
Thanks also to my two tech editors, Scott Warmbrand and Donald Fuller.
They were the ones who had to read through all the material and straighten
me out when I went technically astray.
Thanks also to all those other people at Sybex that I don't know. These include the graphic artist, Tony Jonick; the electronic publishing specialist, Susie Hendrickson; the proofreaders, Jennifer Campbell, Nanette Duffy, Amey Garber, Leslie E. H. Light, and Yariv Rabinovitch; and the indexer, Nancy Guenther.
A really big thanks to Senoria Bilbo-Brown. She is the one who made sure
the advance checks got to me in short order. Sen, you know I love ya!
And, of course, thanks to you, the reader, for buying this book and using
it to advance your career. May you not only pass the test, but smoke it!
Gary Govanus


Introduction

The Server+ certification tests are sponsored by the Computing Technology Industry Association (CompTIA) and supported by several of the computer industry's biggest vendors (for example, Compaq, IBM, and
Microsoft). This book was written to provide you with the knowledge you
need to pass the exam for Server+ certification. Server+ certification gives
employers a benchmark for evaluating their employees' knowledge. When an applicant for a job says, "I'm Server+ certified," the employer can be
assured that the applicant knows the fundamental server and networking
concepts. For example, a Server+ certified technician should know the difference between the various types of hard disk subsystems and how to configure them, the differences between various server types, and the
advantages and disadvantages of different network operating systems.
This book was written at an intermediate technical level; we assume that you
already know some of the information in the A+ certification and know about
hardware basics. The exam itself covers basic server topics as well as some more
advanced issues, and it covers some topics that anyone already working as a
technician, whether with computers or not, should be familiar with. The exam
is designed to test you on these topics in order to certify that you have enough
knowledge to intelligently discuss various aspects of server operations.
We've included review questions at the end of each chapter to give you a taste of what it's like to take the exam. If you're already working as a network
administrator, we recommend you check out these questions first to gauge
your level of knowledge. You can begin measuring your level of expertise by
completing the assessment test at the end of this Introduction. Your score will
indicate which areas need improvement. You can use the book mainly to fill
in the gaps in your current knowledge of servers.
If you can answer 80 percent or more of the review questions correctly for
a given chapter, you can probably feel safe moving on to the next chapter. If
you're unable to answer that many correctly, reread the chapter and try the
questions again. Your score should improve.

Don't just study the questions and answers; the questions on the actual
exam will be different from the practice ones included in this book and on the
CD. The exam is designed to test your knowledge of a concept or objective, so
use this book to learn the objective behind the question.


What Is Server+ Certification?


The Server+ certification program was developed by the Computing Technology Industry Association (CompTIA) to provide an industry-wide means of
certifying the competency of a Server Hardware Specialist. The Server+ certified diploma, which is granted to those who have attained the level of
knowledge and troubleshooting skills needed to provide capable support in
the field of networked servers, is similar to other certifications in the computer industry. For example, Novell offers the Certified Novell Engineer
(CNE) program to provide the same recognition for network professionals
who deal with its NetWare products, and Microsoft has its Microsoft Certified Systems Engineer (MCSE) program. The theory behind these certifications is that if you need service, you would sooner call a technician
who has been certified in one of the appropriate certification programs than
the first so-called expert in the phone book.
What is a Server Hardware Specialist? According to CompTIA, a Server
Hardware Specialist is someone who spends time solving problems to ensure
that servers are functional and applications remain available. The specialist
should have an in-depth understanding of how to plan a network and how to
install, configure, and maintain a server. This should include knowing the hardware that goes into a server implementation, how data storage subsystems
work, the basics of data recovery, and how I/O subsystems work. In short, the
Server Hardware Specialist should have the following competencies:


Have a working knowledge of troubleshooting, physical security, and disaster recovery. Be able to recover from a server failure.

Provide high availability of servers.

Understand hardware configuration and network connectivity.

Install and configure server hardware to meet the operating system prerequisites.

Understand current and emerging data storage and data transfer technologies.

Understand networking protocols.

Provide second-level support for resellers and end users.

Perform routine maintenance on server systems, data storage systems, and other network devices.

Plan and carry out an upgrade without impacting users.

CompTIA recommends that the candidate for Server+ Certification have between 18 and 24 months' experience in the server technology industry, as well as some experience running a server. In addition, it is assumed that you have at least one other IT certification, like the CompTIA A+, a Compaq ACT, the Novell CNA, Microsoft MCP, HP Star, SCO, or Banyan.

Why Become Server+ Certified?


There are several good reasons to get your Server+ certification. The CompTIA Candidates' Information packet lists five major benefits:

It demonstrates proof of professional achievement.

It increases your marketability.

It provides greater opportunity for advancement in your field.

It is increasingly found as a requirement for some types of advanced training.

It raises customer confidence in you and your company's services.

Provides Proof of Professional Achievement


The Server+ certification is quickly becoming a status symbol in the computer service industry. Organizations with members in the computer service
industry are recognizing the benefits of Server+ certification and are pushing
for their members to become certified. And more people every day are putting the Server+ Certified emblem on their business cards.

Increases Your Marketability


Server+ certification makes individuals more marketable to potential
employers. Also, Server+ certified employees might receive a higher base
salary because employers won't have to spend as much money on vendor-specific training.


Provides Opportunity for Advancement


Most raises and advancements are based on performance. Server+ certified
employees work faster and more efficiently, thus making them more productive. The more productive employees are, the more money they will make for
their company. And, of course, the more money they make for the company,
the more valuable they will be to the company. So if an employee is Server+
certified, her chances of getting promoted will be greater.

Fulfills Training Requirements


Server+ certification is recognized by most major computer hardware vendors, including (but not limited to) IBM, Hewlett-Packard, Apple, and Compaq. Some of these vendors will apply Server+ certification toward
prerequisites in their own respective certification programs.

Raises Customer Confidence


As the Server+ certified qualification becomes better known among computer owners, more of them will feel that the Server+ certified person is more
qualified to work on their computer equipment than a non-certified technician is.

How to Become Server+ Certified


Server+ certification is available to anyone who passes the tests. You don't have to work for any particular company. It's not a secret society. It is, however, an
elite group. In order to become Server+ certified, you must pass the test.
The exam is administered by Prometric and can be taken at any Prometric Testing Center. If you pass the exam, you will get a certificate in the
mail from CompTIA saying that you have passed, and you will also receive
a lapel pin and business card. To find the Prometric testing center nearest
you, call (800) 755-EXAM (755-3926).
To register for the tests, call Sylvan at (800) 77-MICRO (776-4276). You'll be asked for your name, Social Security number (an optional number may be assigned if you don't wish to provide your Social Security number), mailing
address, phone number, employer, when and where (i.e., which Prometric testing center) you want to take the test, and your credit card number. Arrangement
for payment must be made at the time of registration.


It is possible to pass this test without any reference materials, but only if
you already have the knowledge and experience that come from reading
about and working with servers. Even experienced server people tend to
have what you might call a 20/80 situation with their computer knowledge: they may use 20 percent of their knowledge and skills 80 percent of
the time, and rely on manuals, guesswork, the Internet, or phone calls for the
rest. By covering all the topics that are tested by the exam, this book can help
you refresh your memory concerning topics that, until now, you seldom
used. (It can also serve to fill in gaps that, let's admit, you may have tried to
cover up for quite some time.) Further, by treating all the issues that the
exam covers, this book can serve as a general field guide, one that you may
want to keep with you as you go about your work.

In addition to reading the book, you might consider practicing these objectives
through an internship program. (After all, all theory and no practice make for a
poor technician.)

Who Should Buy This Book?


If you are one of the many people who want to pass the Server+ exam, and
pass it confidently, then you should buy this book and use it to study for the
exam. This book was written with one goal in mind: to prepare you for the
challenges of the real IT world, not just to pass the Server+ exam. This study
guide will do that by describing in detail the concepts on which youll be
tested.

How to Use This Book and CD


This book includes several features that will make studying for the Server+
exam easier. At the beginning of the book (right after this introduction, in
fact) is an assessment test that you can use to check your readiness for the
actual exam. Take the exam before you start reading the book. It will help
you determine the areas you need to brush up on. You can then focus on
these areas while reading the book. The answers to the assessment test
appear on a separate page after the last question of the test. Each answer also
includes an explanation and a note telling you in which chapter this material
appears.


To test your knowledge as you progress through the book, check out the
review questions at the end of each chapter. As you finish each chapter,
answer the review questions and then check to see if your answers are right; the correct answers appear on the page following the last review question. You can go back to reread the section that deals with each question you got wrong to ensure that you answer it correctly the next time you are tested on the material.
On the CD-ROM you'll find two sample exams. You should test your
knowledge by taking the practice exam when you have completed the book
and feel you are ready for the Server+ exams. Take this practice exam just as
if you were actually taking the Server+ exam (i.e., without any reference
material). When you have finished the practice exam, move on to the bonus
exam to solidify your test-taking skills. If you get more than 90 percent of the answers correct, you're ready to go ahead and take the real exam.
The CD-ROM also includes several extras you can use to bolster your
exam readiness:
Electronic flashcards You can use these 150 flashcard-style questions to
review your knowledge of Server+ concepts. They are available for PCs
and handheld devices. You can download the questions right into your
Palm device for quick and convenient reviewing anytime, anywhere, without your PC!
Test engine The CD-ROM includes all of the questions that appear in this
book: the assessment questions at the end of this introduction and all of the
chapter review questions. Additionally, it includes a practice exam and a
bonus exam. The questions appear much like they did in the book, but you
can also choose to randomize them. The randomized test will allow you to
pick a certain number of questions to be tested on, and it will simulate the
actual exam. Combined, these test engine elements will allow you to test your
readiness for the real Server+ exam.
Full text of the book in PDF If you are going to travel but still need to
study for the Server+ exam and you have a laptop with a CD-ROM drive, you
can take this entire book with you just by taking the CD-ROM. This book is
in Adobe Acrobat PDF format so it can be easily read on any computer.


Sybex as a Server+ Cornerstone Member


Sybex is proud to serve as a Cornerstone Member of the Server+ Advisory
Committee. Just as CompTIA is committed to establishing measurable standards for certifying individuals who will support computer and networking
systems in the future, Sybex is committed to providing those individuals with
the skills needed to meet those standards. Attaining a Server+ certification is
evidence that the candidate possesses core competencies for vendor-neutral
server integration, management, and service. Server+ is the logical companion
to both the A+ and Network+ certifications. CompTIA's addition of Server+
to their current line of exams creates a true baseline knowledge across the
industry for those seeking to increase their IT skills. Sybex is committed to
bringing that information to you, the future Server+ certified administrator. By
working alongside CompTIA, and in conjunction with other esteemed members of the Server+ exam committee, it is our desire to help bridge the gap in
knowledge and skills that currently confronts the IT industry.
Sybex expects the Server+ program to be well received, both by companies
seeking qualified technical staff and by the IT training community. Along with
the existing line of vendor-neutral certifications from CompTIA, the Server+
certification should prove to be an invaluable asset in the years ahead.

The Exam Objectives


Behind every computer industry exam you can be sure to find exam objectives: the broad topics in which the exam developers want to ensure your
competency. The official CompTIA Server+ exam objectives are listed here.

Server+ Exam Blueprint


The table that follows lists the areas (or in CompTIA terms, the domains)
measured by this examination and the approximate extent to which they
are represented.


Job Dimension                                      % of Exam (approximate)

1.0: Installation                                  17%
2.0: Configuration                                 18%
3.0: Upgrading                                     12%
4.0: Proactive Maintenance                          9%
5.0: Environment                                    5%
6.0: Troubleshooting and Problem Determination     27%
7.0: Disaster Recovery                             12%

1.0: Installation
1.1 Conduct pre-installation planning activities:


Plan the installation.

Verify the installation plan.

Verify hardware compatibility with operating system.

Verify power sources, space, UPS, and network availability.

Verify that all correct components and cables have been delivered.

1.2 Install hardware using ESD best practices (boards, drives, processors, memory, internal cable, etc.):




Mount the rack installation.

Cut and crimp network cabling.

Install UPS.


Verify SCSI ID configuration and termination.

Install external devices (e.g., keyboards, monitors, subsystems, modem rack, etc.).

Verify power-on via power-on sequence.

2.0: Configuration
2.1 Check/upgrade BIOS/firmware levels (system board, RAID, controller, hard drive, etc.).


2.2 Configure RAID.
2.3 Install NOS.


Configure network and verify network connectivity.

Verify network connectivity.

2.4 Configure external peripherals (UPS, external drive subsystems, etc.).


2.5 Install NOS updates to design specifications.
2.6 Update manufacturer-specific drivers.
2.7 Install service tools (SNMP, backup software, system monitoring agents, event logs, etc.).


2.8 Perform Server baseline.
2.9 Document the configuration.

3.0: Upgrading
3.1 Perform full backup:


Verify backup.

3.2 Add Processors:




On single-processor upgrade, verify compatibility.

Verify N+1 stepping.

Verify speed and cache matching.

Perform BIOS upgrade.

Perform OS upgrade to support multiprocessors.


Perform upgrade checklist, including: locate/obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions,
facts and issues; test and pilot; schedule downtime; implement
ESD best practices; confirm that upgrade has been recognized;
review and baseline; document upgrade.

3.3 Add hard drives:




Verify that drives are the appropriate type.

Confirm termination and cabling.

For ATA/IDE drives, confirm cabling, master/slave, and potential cross-brand compatibility.

Upgrade mass storage.

Add drives to array.

Replace existing drives.

Integrate into storage solution and make it available to the operating system.

Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions,
facts and issues; test and pilot; schedule downtime; implement
using ESD best practices; confirm that the upgrade has been
recognized; review and baseline; document the upgrade.

3.4 Increase memory:




Verify hardware and OS support for capacity increase.

Verify memory is on hardware/vendor compatibility list.

Verify memory compatibility (e.g., speed, brand, capacity, EDO,


ECC/non-ECC, SDRAM/RDRAM).

Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions,
facts and issues; test and pilot; schedule downtime; implement
using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.

Verify that server and OS recognize the added memory.

Perform server optimization to make use of additional RAM.


3.5 Upgrade BIOS/firmware:




Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions, facts
and issues; test and pilot; schedule downtime; implement using ESD
best practices; confirm that the upgrade has been recognized;
review and baseline; document the upgrade.

3.6 Upgrade adapters (e.g., NICs, SCSI cards, RAID, etc.):




Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions,
facts and issues; test and pilot; schedule downtime; implement
using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.

3.7 Upgrade peripheral devices, internal and external:




Verify appropriate system resources (e.g., expansion slots, IRQ, DMA, etc.).

Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions, facts
and issues; test and pilot; schedule downtime; implement using ESD
best practices; confirm that the upgrade has been recognized; review
and baseline; document the upgrade.

3.8 Upgrade system monitoring agents:




Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions,
facts and issues; test and pilot; schedule downtime; implement
using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.

3.9 Upgrade service tools (e.g., diagnostic tools, EISA configuration, diagnostic partition, SSU, etc.):




Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions, facts
and issues; test and pilot; schedule downtime; implement using ESD
best practices; confirm that the upgrade has been recognized; review
and baseline; document the upgrade.


3.10 Upgrade UPS:




Perform upgrade checklist, including: locate and obtain latest test


drivers, OS updates, software, etc.; review FAQs, instructions, facts
and issues; test and pilot; schedule downtime; implement using ESD
best practices; confirm that the upgrade has been recognized; review
and baseline; document the upgrade.

4.0: Proactive Maintenance


4.1 Perform regular backup.
4.2 Create baseline and compare performance.
4.3 Set SNMP thresholds.
4.4 Perform physical housekeeping.
4.5 Perform hardware verification.
4.6 Establish remote notification.

5.0: Environment
5.1 Recognize and report on physical security issues:


Limit access to server room and backup tapes.

Ensure physical locks exist on doors.

Establish anti-theft devices for hardware (lock server racks).

5.2 Recognize and report on server room environmental issues (temperature, humidity/ESD/power surges, backup generator/fire suppression/flood considerations).

6.0: Troubleshooting and Problem Determination


6.1 Perform problem determination:


Use questioning techniques to determine what, how, when.

Identify contact(s) responsible for problem resolution.

Use senses to observe problem (e.g., smell of smoke, observation of unhooked cable, etc.).


6.2 Use diagnostic hardware and software tools and utilities:




Identify common diagnostic tools across the following operating


systems: Microsoft Windows NT/2000, Novell Netware, UNIX,
Linux, and IBM OS/2.

Perform shutdown across the following operating systems:


Microsoft Windows NT/2000, Novell Netware, UNIX, Linux,
and IBM OS/2.

Select the appropriate tool.

Use the selected tool effectively.

Replace defective hardware components as appropriate.

Identify defective FRUs and replace with correct part.

Interpret error logs, operating system errors, health logs, and critical
events.

Use documentation from previous technician successfully.

Locate and effectively use hot tips (e.g., fixes, OS updates, E-support,
Web pages, CDs).

Gather resources to get problem solved.

Identify situations requiring call for assistance.

Acquire appropriate documentation.

Describe how to perform remote troubleshooting for a wake-on-LAN.

Describe how to perform remote troubleshooting for a remote alert.

6.3 Identify bottlenecks (e.g., processor, bus transfer, I/O, disk I/O, network I/O, memory).
6.4 Identify and correct misconfigurations and/or upgrades.
6.5 Determine if problem is hardware, software, or virus related.

7.0: Disaster Recovery


7.1 Plan for disaster recovery:


Plan for redundancy (e.g., hard drives, power supplies, fans, NICs, processors, UPS).


Use the technique of hot swap, warm swap and hot spare to
ensure availability.

Use the concepts of fault tolerance/fault recovery to create a disaster recovery plan.

Develop disaster recovery plan.

Identify types of backup hardware.

Identify types of backup and restoration schemes.

Confirm and use off-site storage for backup.

Document and test disaster recovery plan regularly, and update as


needed.

7.2 Restoring.


Identify hardware replacements.

Identify hot and cold sites.

Implement disaster recovery plan.

Tips for Taking the Server+ Exam


Here are some general tips for taking your exam successfully:


Bring two forms of ID with you. One must be a photo ID, such as a
driver's license. The other can be a major credit card or a passport.
Both forms must have a signature.

Arrive early at the exam center so you can relax and review your study
materials, particularly tables and lists of exam-related information.

Read the questions carefully. Don't be tempted to jump to an early


conclusion. Make sure you know exactly what the question is asking.

Don't leave any unanswered questions. Unanswered questions are


scored against you.

There will be questions with multiple correct responses. When there is


more than one correct answer, a message at the bottom of the screen
will prompt you to "Choose all that apply." Be sure to read the messages displayed.


When answering multiple-choice questions you're not sure about, use


a process of elimination to get rid of the obviously incorrect options
first. This will improve your odds if you need to make an educated
guess.

On form-based tests, because the hard questions will eat up the most time,
save them for last. You can move forward and backward through the
exam. When the exam becomes adaptive, this tip will not work.

For the latest pricing on the exams and updates to the registration procedures, call Prometric at (800) 755-EXAM (755-3926) or (800) 77-MICRO
(776-4276). If you have further questions about the scope of the exams or
related CompTIA programs, refer to the CompTIA site at www.comptia.org/.


Assessment Test
1. In a Fibre Channel configuration, what constitutes a point-to-point link?
A. Arbitrated loop
B. Fabric
C. A bidirectional link that connects the N_ports on two nodes
D. Two NL_ports connected to two FL_ports
2. What do you call a list of IP addresses that can be assigned by an automatic server process?


A. A DNS scope
B. A DHCP scope
C. FTP
D. Lost
3. You have just purchased a mainboard that supports dual processors. Which Pentium III processors can be used on the board?


A. Any Xeon with any P-II
B. P-IIIs of the same speed
C. Any P-III
D. Any P-II with any P-III
4. If you have a RAID 3 system made up of four 20GB drives, how much usable disk storage space would you have?


A. 80GB
B. 60GB
C. 40GB
D. 20GB
E. Not enough information to make a determination.
5. The performance of a RISC processor depends on which of the following?
A. The code it is executing
B. The speed of the network card
C. The amount of RAM
D. The video card


6. Memory interleaving is another way of doing which of the following?


A. Error checking
B. Accessing information stored on the memory chip
C. Determining parity
D. Installing chips
7. What happens when an ECC memory module determines that corruption has occurred in 1 bit?


A. The problem is immediately corrected and the end user is none the wiser.
B. An error message pops up on the screen describing the error to the
end user and giving the user a chance to fix the problem.
C. An entry is made in the memory error log, but the system continues
to operate.
D. The system is halted.
8. What happens when a parity-checking memory module determines that corruption has occurred?


A. The problem is immediately corrected and the end user is none the wiser.
B. An error message pops up on the screen describing the error to the end user and giving the user a chance to fix the problem.
C. An entry is made in the memory error log, but the system continues
to operate.
D. The system is halted.
9. How many interrupts are available with PCI?
A. 64
B. 32
C. 16
D. 8
10. How can you configure load balancing in a PCI Bridged environment?
A. Configure one bridge as a master, and the other as a slave.
B. You will have to buy special devices to make this work.
C. You will have to purchase a special connector.
D. Load balancing is not recommended in a bridged environment.


11. What are the three software layers of I2O?


A. OS/2
B. OSM
C. NLM
D. IML
E. CDM
F. HDM
12. What are three possible configurations for an ATA/IDE device?
A. Master, with slave present
B. Slave, with a master present
C. Slave, no master present
D. Master, no slave present
E. Leader
F. Follower
13. What is the bandwidth of PCI?
A. 133 MBytes/second
B. 156 MBytes/second
C. 64 MBytes/second
D. 32 MBytes/second
14. What is the N_port's unique address called?
A. N_port Identifier
B. Well-known port address
C. IP address
D. VPN address
15. When cabling a building, what should you do?
A. Only use fiber optic cable.
B. Always use copper conduit.
C. Always check local building codes.
D. Assume that you do not need a permit.


16. Four network cards grouped together for Load Balancing will have how many IP addresses?


A. Four
B. Three
C. Two
D. One
17. ATA 100 can also be referred to as which of the following?
A. ATA Parallel
B. ATA Serial
C. ATA
D. ATA Bipolar
18. Why would you install a bridge?
A. To route packets
B. To minimize traffic on a network segment
C. For security purposes
D. To dynamically assign IP Addresses
19. You have a single network card with four ports on it. What can that card not be configured to do?


A. Adapter Load Balancing
B. Adapter Teaming
C. Adapter Fault Tolerance
20. You have a very important database on your network. You decide to check to make sure the database is getting backed up, so you try to restore one of the files to another server. You find the file was not backed up. What is a likely reason for this happening?
A. The file had not been accessed that day.
B. The tape backup program cannot back up open files.
C. The tape backup program cannot back up files that big.
D. The tape backup program did not run.


21. In every SCSI-3 bus, how many terminators are there?


A. Four
B. Three
C. Two
D. One
E. One per device
22. Which type of application server would be used by gamers?
A. Peer to peer
B. Distributed
C. Dedicated
D. None of the above
23. Pick the levels of cache that can be present in a computer with a Pentium III Gigahertz processor.


A. L1
B. L2
C. L3
D. L4
24. If a DHCP Server is not on the same subnet as the hosts it serves, what must be configured?
A. A DNS Server
B. A relay Agent
C. Another DHCP Server
D. SMTP
E. DMI
25. Name three ways NICs can work together.
A. Adapter Grouping
B. Adapter Fault Tolerance
C. Adapter Virtual Private Networks
D. Adapter Load Balancing
E. Adapter Teaming


26. What does PIO stand for?


A. Progressive Input/Output
B. Processor Input/Output
C. Programmable Input/Output
D. Programmed Input/Output
27. If a server is rated at 4U, what does that mean?
A. It is a rack-mounted server that will use 4 milliamps of power.
B. It is a rack-mounted server that will use 400 volts of power.
C. It is a rack-mounted server that can fit in a 40-inch rack.
D. It is a rack-mounted server that will cover four of the mounting holes in the rack.


28. What type of server resolves DNS names to IP addresses?
A. DHCP
B. DNS
C. UDP
D. SMTP
29. Why are maintenance logs important?
A. They provide a clear picture of what the service techs have been doing.
B. They provide a background of what has been done to a computer.
C. They provide an instruction manual for doing routine tasks.
30. What is the plenum?
A. The type of metallic shielding surrounding a fiber optic cable
B. The type of cable used in fiber optic installations
C. The air space between the ceiling and the actual roof of a building
D. Precious metal like gold
31. With which Internet standard protocol is Active Directory accessed?
A. SNMP
B. SMTP
C. LDAP
D. POP3


32. Will an ATA 100 device use the same type of cable as an ATA 66 device?
A. No
B. Yes
33. A BNC connector is used on what type of Ethernet implementation?
A. ThinNet
B. Thicknet
C. UTP
D. STP
34. How many terminators are there on a ThinNet network?
A. One
B. Two
C. One for every 50 hosts
D. One for every 100 hosts
35. Which is true of fiber optics?
A. It is affected by EMI.
B. It is affected by heat.
C. The cable can be made of glass.
D. The cable is always made of copper.

Answers to Assessment Test


1. C. In Fibre Channel, a point-to-point connection is a bidirectional link that connects the N_ports on two nodes.


2. B. It is called a DHCP scope.
3. B. With Pentium III processors, the multiplier and the FSB must match.
4. B. You would have three 20GB drives for storage and one 20GB drive for parity. Therefore, you would have 60GB of usable storage space.
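
If you like to double-check this sort of arithmetic, the pattern is easy to capture in a few lines of code. Here is a minimal sketch in Python (the drive count and size are just the values from this question):

    drives = 4        # total drives in the array
    size_gb = 20      # capacity of each drive in GB

    # RAID 3 stripes data across the drives but dedicates one full
    # drive's worth of capacity to parity, so usable space is (n - 1) drives.
    usable_raid3 = (drives - 1) * size_gb
    print(usable_raid3)   # 60

The same product, with the parity term removed, gives the 80GB you would see from simple striping (RAID 0).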
5. A. The performance of the RISC processor depends on the code it is executing.
6. B. Memory interleaving is a way of quickly getting access to information stored on the memory chip.


7. A. With an ECC memory module, if there is a problem with 1 bit, it will be detected and corrected. ECC memory can determine corruption of up to 4 bits, but with anything over 1 bit, the system is halted.
8. D. With parity, if it is determined there has been some corruption, the system is halted.
9. C. PCI can use up to 16 interrupts.
10. D. If you are using a bridged architecture, load balancing is not recommended.
11. B, D, F. I2O is made up of three software layers: the OS Services Module (OSM), the I2O Messaging Layer (IML), and the Hardware Device Module (HDM).
12. A, B, D. IDE devices can be a master with no slave present, a master with a slave present, and a slave with a master present.


13. A. The PCI bandwidth is 133 MBytes/second.
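
The 133 figure is not arbitrary; it falls straight out of the bus width and clock rate. A quick sketch of the arithmetic in Python, assuming the standard 32-bit, 33.33 MHz PCI bus:

    bytes_per_transfer = 32 // 8      # a 32-bit bus moves 4 bytes per clock
    clock_hz = 33_330_000             # standard PCI clock, roughly 33 MHz

    # Peak rate assumes one transfer per clock cycle.
    peak_bytes_per_second = bytes_per_transfer * clock_hz
    print(peak_bytes_per_second / 1_000_000)   # about 133 MBytes/second

The 64-bit and 66 MHz PCI variants scale this same product up accordingly.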
14. A. The N_port has a unique address, called the N_port Identifier.
15. C. When cabling a building, you should check the local building codes. These codes will vary by locality.


16. D. A group of network cards used in Load Balancing will have one IP address.

Copyright 2001 SYBEX, Inc., Alameda, CA

www.sybex.com

Answers to Assessment Test

xlviii

17. A. ATA 100 is a parallel ATA interface, so it can also be referred to as ATA Parallel.


18. B. A bridge would be used to minimize traffic between network segments.
19. C. Fault Tolerance requires at least two cards, not at least two ports.
20. B. Many tape backup programs are not capable of backing up open database files without an add-in component.


21. C. There will always be two terminators, one at each end of the SCSI chain.
22. A. A peer-to-peer application server would be the type that may be used by gamers.
23. A, B. There are only two types of cache, L1 and L2.
24. B. A DHCP relay agent must be configured.
25. B, D, E. Adapters can work together with Load Balancing, Fault Tolerance, or Teaming.
26. D. PIO is the abbreviation for Programmed Input/Output.
27. D. The U rating is the number of mounting holes that the device will utilize. A U is 4.445 centimeters or 1.75 inches.
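
The conversion is worth committing to memory, and it is trivial to verify. A tiny sketch in Python, using the 1.75-inch figure from the answer above:

    U_INCHES = 1.75             # height of one rack unit
    U_CM = 4.445                # the same unit in centimeters

    height_u = 4                # the 4U server from the question
    print(height_u * U_INCHES)  # 7.0 inches
    print(height_u * U_CM)      # 17.78 centimeters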


28. B. A DNS Server resolves a DNS name to an IP address.
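
You can watch this resolution happen from any machine with a working resolver. A minimal illustration in Python (the hostname is just an example; any name your DNS server can resolve will do):

    import socket

    # gethostbyname() hands the name to the operating system's resolver,
    # which ultimately queries a DNS server and returns an IPv4 address.
    print(socket.gethostbyname("www.example.com"))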
29. B. Maintenance logs provide a background of what has been done to a computer.
30. C. The plenum is the space between the drop-down ceiling and the roof where you can run cables.


31. C. Active Directory is based on LDAP.
32. B. Yes. ATA 100 uses the same 80-conductor cable that was introduced with ATA 66.
33. A. BNC Connectors are used on ThinNet Ethernet networks.
34. B. A ThinNet segment is terminated at both ends, so there are two terminators.
35. C. Fiber cable is made of glass or plastic.


Chapter 1

Disk Subsystems
SERVER+ EXAM OBJECTIVES COVERED IN
THIS CHAPTER:
1.2 Install hardware using ESD best practices (boards, drives, processors, memory, internal cable, etc.).


Mount the rack installation.

Cut and crimp network cabling.

Install UPS.

Verify SCSI ID configuration and termination.

Install external devices (e.g., keyboards, monitors, subsystems, modem rack, etc.).

Verify power-on via power-on sequence.

2.2 Configure RAID.


3.3 Add hard drives.


Verify that drives are the appropriate type.

Confirm termination and cabling.

For ATA/IDE drives, confirm cabling, master/slave, and potential cross-brand compatibility.

Upgrade mass storage.

Add drives to array.

Replace existing drives.

Integrate into storage solution and make it available to the operating system.

Perform upgrade checklist, including: locate and obtain


latest test drivers, OS updates, software, etc.; review FAQs,
instructions, facts and issues; test and pilot; schedule
downtime; implement using ESD best practices; confirm that
the upgrade has been recognized; review and baseline;
document the upgrade.


3.6 Upgrade adapters (e.g., NICs, SCSI cards, RAID, etc.).




Perform upgrade checklist, including: locate and obtain


latest test drivers, OS updates, software, etc.; review FAQs,
instructions, facts and issues; test and pilot; schedule
downtime; implement using ESD best practices; confirm that
the upgrade has been recognized; review and baseline;
document the upgrade.

7.1 Plan for disaster recovery.




Plan for redundancy (e.g., hard drives, power supplies, fans, NICs, processors, UPS).

Use the technique of hot swap, warm swap, and hot spare to
ensure availability.

Use the concepts of fault tolerance/fault recovery to create a disaster recovery plan.

Develop disaster recovery plan.

Identify types of backup hardware.

Identify types of backup and restoration schemes.

Confirm and use off site storage for backup.

Document and test disaster recovery plan regularly, and update as needed.

7.2 Restoring


Identify hardware replacements.

Identify hot and cold sites.

Implement disaster recovery plan.


Don't you just hate it? You buy the darn book, hoping to be
slowly and gently eased into the studying process, and in the very first chapter
the author nails you with a ton of objectives. Well, take heart: if you have
passed the Network+ test and the A+ test, about 20% of the material in this
book will be old hat! You will already know it.
The Server+ exam is designed to give you background into the inner
workings of your local server platform. Throughout this book we are going
to be talking about the different types of hardware that make up a server, the
different types of servers that can be put into your network, the different
types of server operating systems, how to care for your servers, and how to
fix them if they break. I suppose we could subtitle this book, The Care and
Feeding of Network Servers.
To make this daunting task a little easier, we are going to break it down
into chunks. As you can see, the first chunk deals with the disk subsystem of
the server. In this first chapter, we will cover four basic areas: Logical and
Physical Drives, SCSI, RAID, and hot swappable. Well spend some time
hashing out terminology, discussing strengths and weaknesses, and looking
at fault tolerances. So lets get to it.

For complete coverage of objective 1.2, please also see Chapters 6, 7, and 8. For
complete coverage of objective 3.3, please also see Chapter 2. For complete coverage of objective 3.6, please also see Chapters 2 and 10. For complete coverage
of objectives 7.1 and 7.2, please also see Chapter 12.


It's All About Perception

In this section, we are going to talk about Logical Drives and Physical
Drives and describe their functionality. We will take a look at how the people
who use the server view the disk subsystem.
Now, if your users are like all the users I have ever dealt with, 95% of them don't care how or why the network operates; they just want it to work every time. Does that sound familiar? When it comes to the subject of drives, they couldn't care less, as long as they can store and retrieve their information.
And that is just the way it should be.

Logical Drives
Every network that I have ever worked on has had a drive letter mapped to
an area that was fondly referred to as the user's home directory. Depending
on the network operating system you are using, it may be called the Users
directory, or the mount point, or something else esoteric, but every system
has one. This is the place where your users can store their highly personal,
private user stuff. You know, like all the jokes they received via e-mail over
the weekend. Anyway, to make this drive easier for users to access, it is
assigned a drive letter; for instance, in my network it is the H: drive. Now,
users don't refer to this area as their home directory; they simply call
it their H: drive. Well, what exactly is their H: drive?
If you were to ask an end user that question, he would probably tell you
that somewhere, back in the deep dark reaches of the computer room, there
would be a wall. Mounted on this wall would be dozens of physical hard
drives and each one of these hard drives would have a little plastic strip on
it, with a name. So, if you wanted to find Elizabeths H: drive, you would
simply find the wall, look to the strips that handled the Es, and there, about
halfway down, would be the drive for Elizabeth. In reality, the Users home
directory is just thatit is a directory that is part of a much larger directory
structure. Using Microsoft Explorer, Figure 1.1 is a small sample of a Users
directory.

FIGURE 1.1

A sample of the Users directory

Now, I suppose right from the start we should get the biases out in the open. First of all, I have been working with computers since the '80s, back when DOS was king and GUI was something slimy. So, I have some problems that you should know about. The first is with the interchangeable terms directories and folders. In the world of GUI, a folder represents a storage area created on a disk for the storage and retrieval of information. In the world of DOS, the same thing was called a directory. Being an old dog, I find it hard to learn new tricks, so if you see the word directory, and you are more comfortable with folder, go for it. In my world, in this context, they are interchangeable.

Depending on the network operating system you are using, these areas are referred to as mapped drives, shares, or mount points. It all amounts to the same thing: a drive letter has been assigned as a pointer to a particular directory or folder on a bigger physical device. It does not even have to be a network operating system; DOS will support up to 23 Logical Drives on a system. As far as the user is concerned, it is a drive just like the C: drive in their computer. As far as you are concerned, it is a Logical Drive, a drive letter that has been assigned as a pointer to a distinct directory or folder on a larger physical device.

Physical Drives

If Logical Drives are pointers to directories or folders on physical devices, it makes sense that the Physical Drive is what you can hold on to and install into a file server. A Physical Drive can be a hard drive, a floppy drive, or even removable storage.

These drives come in a variety of sizes, from the standard 1.44MB floppy drive to hard drives that can go over 30 gigabytes. The hard drives also come in a variety of different technologies and configurations, with a wide variety of different acronyms that you are going to have to be familiar with: things like IDE, EIDE, ATA, SCSI, RAID, hot swap, hot plug, and even hot spare. Over the rest of this chapter and the next chapter, we will be talking about the differences between these.

Real World Scenario

These little ditties will be thrown in throughout the book just to make sure that we keep both feet on the ground and realize that there really is a difference between the CompTIA testing world and the real world you work in every day. Hopefully, the Real World Scenarios will help bring home some of the points we discussed, as well as solidify how the information can be used.

The best way to track the difference between a logical drive and a physical drive is to think of this common scenario: Many times I have had an administrative assistant call me because he wanted some special access to his boss's home directory. The problem was, the administrative assistant didn't know that it was his boss's home directory he wanted access to; he thought it was the H: drive! When I would grant him rights to the boss's home directory, I would map a drive for the administrative assistant and give it a different drive letter, like the I: drive. The discussion would then ensue that the assistant did not want access to the I: drive; he needed access to the boss's H: drive. The whole thing could get really ugly.

Remember that a physical drive can contain hundreds of logical drives, but a logical drive usually resides on just one physical drive.

The World of SCSI

There are several limitations to the way EIDE handles devices. The major limitations are the number of devices that can be controlled from a single paddle card and the lack of redundancy. If one device fails, your entire subsystem is down. Small Computer System Interface (SCSI), on the other hand, addresses those issues. SCSI is just another type of interface that is much more extensible than IDE. Besides hard drives, SCSI will work with CD-ROMs and all sorts of other wonderful things. Even from its early days, SCSI (pronounced "scuzzy") allowed you to have multiple devices strung together in what is referred to as a daisy chain. SCSI was originally designed as a high-speed system-level parallel interface. SCSI has evolved, and now there are all sorts of different levels and speeds. We will explore each of them in this section.

Let's start with the definitions of the interfaces and how they are used.

SCSI-1 Narrow and Regular

One of the best things about SCSI is that SCSI devices are very flexible. They can be installed in just about anything: Macs, PCs, Sun workstations. If there is a SCSI controller made for it, the SCSI peripheral will work with it. Also, with SCSI there is an intelligent controller. This SCSI controller can be an installed interface board, or it can be built in to the mainboard, just like the IDE paddle card. The difference is that the SCSI controller actually does make a difference in the way it controls devices. SCSI is defined as an intelligent peripheral interface that makes use of high-level communication between devices like hard drives or tape drives. Communications occur between an initiator (normally the computer) and a target (usually the peripheral). Data transfer can occur in an asynchronous mode, meaning it is not clocked, or in synchronous (clocked) mode.

When we start talking about synchronous and asynchronous, the term clocked
comes into play. If you are not familiar with what it means, think of your dial-up
connection to the Internet. In this case, you are using an asynchronous modem.
Communication occurs randomly, when and where you want it to. Since there is
no regularity, this is not clocked. With synchronous, whenever a communication
link is established, certain tasks are carried out at regularly timed intervals, and
therefore they are clocked.


Any messages and commands are transferred asynchronously. The data bus in SCSI-1 was 8-bit, so it was considered SCSI Narrow.
The command process went something like this. Take a look at Figure 1.2.
FIGURE 1.2

SCSI chain for communication explanation

SCSI controller

Device 1

Device 2

Device 3

Device 4

As you can see, we have a SCSI controller and four SCSI devices. These devices could be four hard drives, or three hard drives and a tape drive, or two hard drives, a tape drive, and a CD-ROM. You get the point. Anyway, suppose the computer sent messages to the controller to write information to Device 1 and to Device 3. With SCSI, the controller would send a signal to Device 1 telling it that the controller had some work for it to do. Device 1 would respond, and the controller would send the information. Device 1 would send back an acknowledgement, and the controller would then go on to the information that had to be written to Device 3. The trick here is that the controller would do things one step at a time, and could not multitask. That was changed in SCSI-2.

So, how does the controller know which device is which? Well, just like in the diagram, each device is assigned its own unique device number, called the SCSI address. The entire SCSI subsystem is referred to as the SCSI bus. The SCSI address was configured in a variety of ways, including jumpers or rocker switches. That way, when the controller needed to send information to a device, it simply addressed that device on the bus. In order to keep the signals on the wire, each SCSI bus had to be terminated at both ends. We will talk more about termination after we get through defining the types of SCSI.
Not only is SCSI flexible in the kinds of computers it can work in, SCSI
is also flexible in the kinds of devices it can work with. For example, SCSI
can work with tape drives, hard drives, and CD-ROMs, to name a few.
These devices can be internal to the computer or external, in a separate case.
If the devices were internal, they would use a 50-pin ribbon cable. If the
devices were external, they would use a very thick, shielded cable that had a
Centronics 50-pin adapter on one end and a DB-25 connector on the other.

SCSI was defined in 1986 and is a published American National Standards Institute (ANSI) standard called SCSI-1 (X3.131-1986).

SCSI-2 Wide and Fast

As SCSI technology began to mature, it received another ANSI standard (X3.131-1994). This was truly an upgrade from the previous version, because it featured faster data transfer rates, and it also mandated the structure of messages and commands to help improve compatibility.

With SCSI-2, communication shifted to synchronous data transfer, and the width of the bus grew to 16 bits. The data transfer rate went from 2.5 to 10 Mbytes/second on an 8-bit data bus and from 5.0 to 20.0 Mbytes/second for a 16-bit data bus. This change also necessitated a higher density connector, as shown in Figure 1.3.
FIGURE 1.3

A SCSI-2 connector


Now, just to confuse you, when they widened the data transfer path from 8 to 16 bits, the 16-bit version of SCSI-2 was referred to as Wide SCSI. When the bus clock rate doubled, the transfer speed on the 8-bit bus climbed to 10.0 Mbytes/second; this was referred to as Fast SCSI. With Fast SCSI, there was also some new terminology: instead of Mbytes/second, there is the megatransfer (MT). The MT is a unit of measurement that refers to the rate of signals on the interface, regardless of the width of the bus. So, as an example, if you have a 10MT rate on a Narrow SCSI bus, the transfer rate would be 10 Mbytes/second. If, however, you had the same 10MT rate on a Wide bus, it would result in a 20 Mbyte/second transfer rate. The developers finally took the Wide SCSI technology and combined it with Fast SCSI, and that became Fast-Wide SCSI, with a transfer speed of 20 Mbytes/second.
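If it helps to see the megatransfer arithmetic in one place, here is a minimal Python sketch; it is just the conversion described above, not anything taken from the SCSI specifications:

    def mbytes_per_second(mt_rate, bus_width_bits):
        # One transfer moves one bus-width's worth of data
        bytes_per_transfer = bus_width_bits // 8
        return mt_rate * bytes_per_transfer

    print(mbytes_per_second(10, 8))    # Fast SCSI on a Narrow bus: 10
    print(mbytes_per_second(10, 16))   # Same 10MT rate on a Wide bus: 20
    print(mbytes_per_second(20, 16))   # Fast-20 (Ultra) on a Wide bus: 40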
SCSI-2 was backwardly compatible with SCSI-1, but for maximum benefit, it was suggested that you stick with one technology or the other, preferably using a SCSI-2 controller with SCSI-2 devices. With both SCSI-1 and SCSI-2, the number of peripherals that could be connected to any controller was seven.

SCSI-3

Although SCSI is maturing, it is not completely there yet. There are still some limitations, like having no more than seven devices connected to any controller. The next generation of SCSI, SCSI-3, takes care of some of that.

Now, SCSI-3 is still a proposed ANSI standard, but there are a lot of devices out there purporting to be SCSI-3. That is because the SCSI-3 documentation took the very large SCSI-2 specification (in excess of 400 pages) and split it into smaller, bite-size chunks. These smaller documents cover different layers of how the interface will be defined. For example, the following layers are included:

- Physical, which covers things like the connectors, the pin assignments, and the electrical specifications. This document is called SCSI Parallel Interface (SPI).
- Protocol, which covers all the physical layer activity and how it is organized into bus phases and packets.
- Architecture, which provides a description of how the command requests interact with protocols. This can include things like how the requests are organized, how the requests are queued, and how each protocol responds to a request.
- Primary command, which contains the list of commands that have to be supported by all SCSI devices.
- Device-specific commands, which would include things like all the commands that drive CD-ROMs or WORM drives.

Now, when the standards folks started working on this, they recognized how quickly things were changing, so they layered the specifications to allow substitution in different parts of the specifications as the technology evolves. One example would be the standards for a SCSI Fibre Channel interface disk drive. In this case, the physical and protocol layers would have to be replaced with new documents, but the other three layers could remain the same.

So, since the newest features are going to show up in SCSI-3, and since SCSI-3 will be generally higher-performing, you can expect that a SCSI-3 device will exhibit better performance than its SCSI-2 brethren. One of the first things people realized with SCSI-3 was that the number of peripherals changed: now you could have a maximum of 16 devices. Since there was the possibility of having 16 devices on the chain, the length of the cable had to increase also. SCSI-3 also saw added support for a serial interface and for a fiber optic interface. Data transfer rates depended on the way the hardware was implemented, but the data rates could actually climb to hundreds of megabytes per second.

Now it is time to get into some of the ways the SCSI-3 standards are broken up.

SCSI Fast-20 (Ultra)

SCSI Fast-20 is the specification that provides a 20MT/second transfer rate. That means that the data rate is twice as fast as the SCSI Fast rate. With SCSI Fast-20 on a Wide bus, the data transfer rate would be 40 Mbytes/second. This is also referred to as Ultra SCSI.

Don't you wish they would come up with just one name for this stuff and stick with it?

SCSI Fast-40 (Ultra 2)

SCSI Fast-40 is a set of specifications to define the timing for a future revision of SCSI-3. This will achieve rates of 40 MT/second, which result in data transfer rates of 80 Mbytes/second. This is referred to as Ultra 2 SCSI. And this is where things really start to get interesting.
As you may have noticed, each iteration of SCSI had the data transfer rate doubling. In each case, the speed increase was handled by the bus clock speed doubling, causing the maximum signaling frequency to double. As you will see later in the chapter, this was all done using something called Single Ended signaling. Because the way the information was moved over the wires did not change, there was backward compatibility. But that compatibility came with a price.

All SCSI data transfers are carried out over cables that are made up of metallic wires. As signals cross any form of metallic wire, they degrade, and with this degradation comes a weaker signal; and a weak signal is a signal that is subject to distortion. In order to counteract the weak signal, something in the SCSI specifications had to change, and usually that was the cable length. With Ultra SCSI, the cable length had shrunk to just 1.5 meters, which meant that it would be really difficult to connect 15 devices. Since halving the cable length again was unacceptable, when Ultra 2 SCSI came out, the signaling method changed from Single Ended to Low Voltage Differential (LVD).

The switch to LVD allowed the cable length to climb to 12 meters. That was the good news. The bad news was that moving to LVD required new dual-mode bus terminators and longer cables. Basically, things stopped being backwardly compatible. If you happened to attach an Ultra device to an Ultra 2 cable, the entire subsystem stopped working, completely. People learned in a hurry to keep Ultra 2 subsystems totally Ultra 2.

Ultra 3 SCSI (Ultra160)

With Ultra 3 SCSI, things can get really interesting, because the specifications define as many as 63 variations of features that can be present. A specific set of these features has been defined by the industry, and that is what is referred to as Ultra160. The data transfer rate, not surprisingly given the name, is up to 160 Mbytes/second, on a Wide bus only.

Ultra320
Ultra320 SCSI is the one that is not off the drawing board yet, but it is going
to feature data transfer rates up to 320 Mbytes/second. Ultra320 was first
defined in SPI-4.


Single Ended Devices

Single Ended Devices is really kind of a misnomer, because Single Ended really refers to the way a SCSI cable is driven by the SCSI devices. Normal SCSI, or SCSI-1, can also be called Single Ended SCSI. It works like this: any time a signal needs to be sent across the bus, there has to be a wire to carry the signal. With Single Ended SCSI drives, one signal line is grounded and the other is compared to ground. This affects susceptibility to noise, which impacts things like the maximum allowable cable lengths. So, for example, SCSI-1 can use cabling up to 6 meters (19.7 feet) long. A Fast SCSI implementation with transfer speeds of up to 10 MT/second can have cabling that is up to 3 meters (9.8 feet) long. Are you beginning to notice a trend? Think of it this way: the faster the speed of data transfer, the shorter the cable. For Ultra SCSI, which will transfer up to 20 MT/second, the cabling can be up to 1.5 meters (4.9 feet) long if there are more than four active IDs. If there are four or fewer active IDs, then the cable can be up to 3 meters (9.84 feet) in length.
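The trend is easy to quantify. Here is a small Python sketch built only from the single-ended cable limits quoted above (the 1.5-meter figure assumes more than four active IDs):

    # Single-ended cable limits from the text, keyed by transfer rate in MT/second
    se_cable_limit_m = {5: 6.0, 10: 3.0, 20: 1.5}

    for rate in sorted(se_cable_limit_m):
        limit = se_cable_limit_m[rate]
        print(f"{rate:>2} MT/s -> up to {limit} m (rate x length = {rate * limit:.0f})")
    # rate x length stays constant at 30: doubling the speed halves the cable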

If the cable is not Single Ended, it is Differential SCSI. Differential SCSI comes in
Low Voltage Differential or High Voltage Differential, and these devices are not
compatible on the same bus segment without an electronic device such as a SCSI
converter to convert between Single Ended and Differential. With rare exception,
no software (driver) modifications are necessary for conversion between Single
Ended and Differential. There are several variations of terminators developed for
use with Single Ended SCSI and Differential SCSI.

So, what this means to you, the server administrator, is confusion. See, the cable that is used for Single Ended SCSI and the cable that is used for Differential SCSI look the same, even though they are electrically different. To make matters worse, both Single Ended and Differential can use each of the cable types listed in Table 1.1.
TABLE 1.1: SCSI Cable Types

Cable Type | Characteristics
Type A cable | The original SCSI cable. It contains 50 wires and can be used by itself for Narrow 8-bit SCSI or in combination with a B cable for Wide 16-bit SCSI.
Type B cable | Added in SCSI-2 to provide a path for Wide SCSI. It has 68 connections, and in the early days it was used with a Type A cable. Since having two cables proved to be a pain, it was replaced by the new P cable defined for SCSI-3.
Type P cable | Instead of containing 50 wires, the Type P cable contains 68. It is the new standard for Wide SCSI implementations of all speeds.
FireWire cable | A 6-wire cable designed for use with serial SCSI devices connected by FireWire. This cannot be used with any other implementations of SCSI.

Figure 1.4 shows the different types of cable ends for different types of
SCSI devices. About the only way to tell the difference between Single Ended
Devices and Differential Devices is with the judicious use of a volt/ohm
meter.

FIGURE 1.4

SCSI cable connectors

Host, or Host adapter, connector types shown in the figure:
- Most SCSI Slow (5 Mbytes/sec) computers and Host adapters use the Centronics-type 50-pin connector. Also, some 8-bit Fast computers and Host adapters use a 50-pin connector.
- Old Sun and DG computers use a different 50-pin connector.
- Many 8-bit SCSI Fast (up to 10 Mbytes/sec) computers and Host adapters use a 50-pin high-density connector.
- Apple/Mac and some older Sun 8-bit workstations use a 25-pin connector.
- All Fast/Wide (16-bit) SCSI-3 computers and Host adapters, plus old DEC Single Ended SCSI, use 68-pin high-density connectors.

Low Voltage Differential (LVD)

With Single Ended SCSI, there is one wire per signal. With Differential SCSI, for any signal that is going to be sent across the bus, there are two wires to carry it. The first wire carries the same type of signal that the Single Ended SCSI carries, but the second wire in the pair carries its logical inversion. The receiver takes the difference of the pair to determine the data. Since it is taking the difference, you can now figure out the name Differential. This method makes it less susceptible to noise and allows for a greater cable length.

Low Voltage Differential (LVD) will be finalized in the SCSI-3 specifications, but it will use less power than the current High Voltage Differential (HVD). LVD is also less expensive than the current technology and will allow for higher speeds when implemented with Ultra 2 SCSI. The data transfer rate can be increased to 160 Mbytes/second with a cable length of 12 meters.

High Voltage Differential (HVD)

When you think of HVD, think older technology. HVD supports throughput of 40 Mbytes/second at cable lengths of 25 meters. These cards and drives are used in less than 5% of all implementations. They used to be used in implementations where long cable runs were necessary, especially in noisy areas. Now LVD can take the place of HVD and save money. HVD is not compatible with other forms of SCSI devices.

To summarize, take a look at Table 1.2. It covers the types of SCSI.
TABLE 1.2: Different Types of SCSI

SCSI Type | Speed | Connector
SCSI-1 (AKA 8-bit or Narrow) | 5 Mbytes/second | Either a 50-pin Centronics, which was standard on things like scanners or tape drives, or a DB-25, similar to what is found on the Iomega Zip drive
SCSI-2, Fast SCSI (8-bit Narrow) | 10 Mbytes/second | 50-pin high-density, used for things like Iomega Jaz drives or writable CD-ROMs
Ultra SCSI (8-bit Narrow) | 20 Mbytes/second | 50-pin high-density, used for things like Iomega Jaz drives or writable CD-ROMs
Wide SCSI (16-bit Wide) | 20 Mbytes/second | 68-pin high-density, used with hard disk drives
Wide Ultra SCSI (16-bit Wide) | 40 Mbytes/second | 68-pin high-density, used with hard disk drives
Ultra 2 SCSI (16-bit Wide) | 80 Mbytes/second | 68-pin high-density, used with disk drives
Ultra160 SCSI (16-bit Wide) | 160 Mbytes/second | 68-pin high-density, used with disk drives


So, now that you know all about the different types of SCSI, it is time to ask yourself why it is important. First of all, SCSI is extensible. If you have worked around networking for any length of time at all, you know that there is no such thing as too much disk space. If you run out of disk space, it is nice to know that you can add another drive, or group of drives, without much hassle. Cost may be another matter, but there won't be much hassle, as long as you understand termination.

SCSI Termination

A long, long time ago, when I first started playing with hardware, the term SCSI was sometimes enough to bring fear and trepidation into the hearts of the best of technicians, all because of a couple of small terminating resistors.

Earlier we mentioned that a SCSI chain has to be terminated at both ends. That sounds really easy and simple. Sometimes, when you have a combination of several internal devices connected to several external devices, it is not the easiest of jobs to locate the end of a chain. In addition, some devices were terminated by a small resistor pack that plugged into the device, some by jumpers or DIP switches, and sometimes it was a combination of the two. Some devices had terminating resistors that were large and silver and difficult to lose. So, you always had to remember the basics of SCSI troubleshooting: problems are usually caused by termination. When in doubt, break down the chain and add one device at a time until you find the device that is causing the problem, or until you get the entire chain working. It could lead to a trying day. Things have gotten better: some devices now are self-terminating; they just sense whether they are at the end of the chain and terminate themselves.

SCSI termination is just electrical circuitry installed at the end of a cable, designed to match impedances for the purpose of preventing the reflection of electrical signals when they reach the end of the cable. In SCSI, this is done with a device called a terminator.

When working with any SCSI bus segment, remember there should be
two terminators and only two terminators. Not one, not three, but two
terminators. Also, the terminators must be installed at the very ends of
the SCSI cable, not at devices in the middle of the bus.


When you talk about SCSI termination, there are four basic types: Passive, Active, Force Perfect Termination (FPT), and LVD (including LVD/MSE). Let's explore them one by one:

Passive  The simplest form of termination is referred to as Passive. The terminator consists of a 220-ohm resistor that goes from TERMPWR to the signal line and another 330-ohm resistor that goes from the signal line to ground. This form of termination does not cost much, but there are disadvantages. For example, if there is a fluctuation in the TERMPWR voltage, it will show up on the signal lines of the bus. That may be enough to cause data errors. If your system is using SCSI-2, it is recommended that you use Active terminators whenever possible for Single Ended SCSI. Passive terminators are always used with differential (HVD) SCSI.

Active  If the termination is not Passive, it must be taking an Active role. Active termination is referred to as Alternative 2 in SCSI-2. Active termination was developed because of the problems with Passive termination. To solve those problems, Active terminators have a voltage regulator. This regulator serves to reduce the fluctuation effect down to practically nothing. Active termination uses only a 110-ohm resistor, which is installed from the regulator to the signal line. This provides a much closer match to the normal impedance of a SCSI cable. This closer match means a more stable signal, which creates less signal reflection and thus fewer data errors.

Force Perfect Termination (FPT)  Although FPT is not recognized in any of the SCSI specifications, it is a Single Ended termination method that uses diode switching and biasing to make up for any impedance mismatches that exist between the SCSI cabling and the peripheral device, whatever it may be. Since FPT is not part of the specifications, it should not come as a surprise that there are several types of FPT, and these different types may not be totally compatible. Also, by and large you can assume that FPT only works and plays well with FPT.

Low Voltage Differential (LVD)  The terminator for LVD uses a form of Active termination. This termination supports the faster speeds and lower power consumption of LVD as compared to HVD. It works with Ultra 2 and Ultra 3 SCSI.

LVD/MSE  Finally, there is what is referred to as LVD/MSE. This is LVD that makes use of multimode transceivers. In the case of LVD/MSE, the terminator checks the voltage level appearing on the DIFFSENSE pin of the cable. By sensing the voltage level, the terminator knows to automatically configure itself for LVD or for Single Ended. Most new SCSI designs include these multimode transceivers.
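To see why the Passive design is marginal, you can work out its Thevenin equivalent from the resistor values given above. Here is a hedged Python sketch (the resistor values come straight from the Passive description; the 5V TERMPWR figure is a nominal assumption):

    r_pullup = 220.0    # ohms, from TERMPWR to the signal line
    r_pulldown = 330.0  # ohms, from the signal line to ground
    termpwr = 5.0       # volts, nominal TERMPWR (assumed)

    # Thevenin equivalent seen by the signal line
    impedance = (r_pullup * r_pulldown) / (r_pullup + r_pulldown)
    bias = termpwr * r_pulldown / (r_pullup + r_pulldown)

    print(f"Effective termination impedance: {impedance:.0f} ohms")  # about 132
    print(f"Idle signal-line bias: {bias:.1f} V")                    # about 3.0

The roughly 132-ohm result sits farther from the cable's actual impedance than the Active design's 110-ohm resistor does, and the 3V bias moves directly with any TERMPWR fluctuation, which is exactly the weakness the Active regulator was added to cure.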
So, now let's see how to put it into action. Take a look at Figure 1.5.

FIGURE 1.5

Cabling external SCSI devices

As you can see, the Host adapter and the last device in the chain are terminated. But what about those things called SCSI IDs?

SCSI IDs and LUNs

I have mentioned several times that devices can be linked together in a SCSI chain. That brings up the question, how does the SCSI controller keep track of all these devices? Each SCSI device on a chain needs to have a unique SCSI address, called the SCSI ID. These SCSI IDs are assigned in a variety of ways, but usually by either a set of jumpers on internal devices or by a rotary switch for external devices. On an 8-bit SCSI bus, the SCSI ID can be between 0 and 7. If you are using a 16-bit SCSI bus, the IDs can be between 0 and 15, and for a 32-bit bus, the IDs can be between 0 and 31.


Remember, SCSI IDs must be unique on the chain. You cannot have two
device 3s on the chain.

How do you choose which address to assign to which device? Let's look at an example. To keep this simple, let's use an old 8-bit bus, because then we don't have so many numbers to work with. Suppose that we have our controller, three hard drives, and a CD-ROM. If we are using a regular PC SCSI, we have an ID range of 0 to 7. Remember, we are geeks, and all geeks start counting at 0. In this case, the controller would be set to ID 7, because the higher the number, the higher the priority. As for all the rest of the devices, it really doesn't matter, as long as the IDs are unique. For simplicity, we would probably address the hard drives as 0, 1, and 2 and make the CD-ROM device 3. In this chain, there would be a terminator on the controller and a terminator on the CD-ROM.

Usually, set the slowest device with the highest number, which will give it the
highest priority. Also, start your numbering at 0. When you boot your system,
the SCSI controller will attempt to contact each device in the chain, starting at
the lowest number. If you have numbered everything from 6 down, you are
going to spend a lot of time waiting for the controller to decide that devices 0
and 1 are not on the chain!
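If you want to see the priority ordering written out, here is a small Python sketch. The 8-bit ordering (7 highest, 0 lowest) follows directly from the text; the 16-bit ordering shown, 7 down to 0 and then 15 down to 8, is the conventional wide-bus arbitration order:

    def scsi_priority_order(bus_width_bits=8):
        """Return SCSI IDs from highest to lowest arbitration priority."""
        narrow = list(range(7, -1, -1))            # 7, 6, ..., 0
        if bus_width_bits == 8:
            return narrow
        # On a wide bus, the upper IDs slot in below the original eight
        return narrow + list(range(15, 7, -1))     # ..., 15, 14, ..., 8

    print(scsi_priority_order(8))    # [7, 6, 5, 4, 3, 2, 1, 0]
    print(scsi_priority_order(16))   # [7, 6, ..., 0, 15, 14, ..., 8]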

So, you know how to identify the device on the SCSI chain by giving it an address. What if the device performs different functions, and there needs to be a way to address each one? That is where the Logical Unit Number (LUN) comes into play. The LUN is a value that is used to identify a logical unit of a SCSI device. According to the SCSI-2 specifications, there can be up to eight logical units for each SCSI device address. These logical units are numbered from 0 to 7. To give an example of how this might be used, think of a tape drive that has a tape changer. In that case, the entire assembly may have a SCSI ID of 0. The tape drive may have a LUN of 0 and the changer may have a LUN of 1. Therefore, the actual SCSI address of the tape drive would be ID 0, LUN 0.
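In other words, a full SCSI address is really an (ID, LUN) pair. Here is a tiny Python sketch of the tape-changer example from the text (the device names are just labels for illustration):

    devices = {
        (0, 0): "tape drive mechanism",   # SCSI ID 0, LUN 0
        (0, 1): "tape changer",           # same ID, second logical unit
    }
    for (scsi_id, lun), name in sorted(devices.items()):
        print(f"ID {scsi_id}, LUN {lun}: {name}")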


Bus Length
How long can the SCSI chain grow? As I mentioned earlier, it depends on the
level of SCSI you are using. Check out Table 1.3. It should give you a good
idea of the numbers to keep in mind when working with SCSI.
TABLE 1.3: SCSI Bus Summary

SCSI Type | Bus Speed MB/Sec Maximum | Bus Width in Bits | Max Bus Length in Meters (SE) | Max Bus Length in Meters (LVD) | Max Bus Length in Meters (HVD) | Maximum Number of Devices Supported
Narrow SCSI-1 | 5 | 8 | 6 | N/A | 25 | 8
Narrow Fast | 10 | 8 | 3 | N/A | 25 | 8
Fast Wide | 20 | 16 | 3 | N/A | 25 | 16
Narrow Ultra | 20 | 8 | 1.5 | N/A | 25 | 8
Wide Ultra | 40 | 16 | 1.5 | N/A | 25 | 16
Narrow Ultra 2 | 40 | 8 | N/A | 12 | 25 | 8
Wide Ultra 2 | 80 | 16 | N/A | 12 | 25 | 16
Ultra160 | 160 | 16 | N/A | 12 | N/A | 16
Ultra320 | 320 | 16 | N/A | 12 | N/A | 16

Servers and SCSI

So, why is all this stuff important? First of all, SCSI is exceptionally versatile. As a matter of fact, in Minnesota I know of a factory where at one time the entire manufacturing process was handled from one PC, linked to all the machines through SCSI interfaces. Now, I doubt seriously that you want your server to handle an entire manufacturing plant, but you probably will want to take advantage of the way you can add specialty devices like tape backup units or CD-ROM towers to your network.

Real World Scenario

If you ever get involved in installing SCSI devices, you will find that things can get confusing quickly. Assigning each drive its own address and making sure that there are only two terminators on any SCSI chain sounds like such an easy thing to do, but sometimes, when you are up to your elbows in server, with drives lying all around and jumpers sitting here and there, it can get to be somewhat out of control.

This is where the KISS rule comes into play. The KISS rule says, "Keep It Simple, Stupid." When it is applied to SCSI, it means to take it one step at a time. If things are not working the way they should, just put one device into the chain at a time, and get the chain working properly each step along the way. You will be up and running in no time. The problem with this little tip is that it requires the technician to have patience, and that is not usually one of our strongest traits.

When you are dealing with SCSI, especially new implementations, if you are having problems, think termination and addressing. Those are the most common problems.

RAID

Another way that SCSI plays an important part for your server is in the way it can be used to provide redundancy and increased performance. Much of that is done with a technology called Redundant Array of Independent Disks (RAID).


In every book I have ever written for Sybex, I have mentioned the gee-whiz factor of computers and of networking. The gee-whiz factor works like this: I understand how it works, I know why it works, I even know how to make it work, but the fact that it works the way it does still amazes me. Now, I admit, I am easily amused. But when I think about routing, I am truly amazed. When I think about the elegant simplicity of Domain Name Service (DNS), I am amazed. But I am really in awe of RAID technology. The upper levels of RAID are seriously impressive. So, this is what we are going to talk about in this section.

Definition of Terms

I know, I know, there is a whole glossary in the back of the book dedicated to defining terms. I also know that there are terms that I am going to be throwing out over the next several pages that we should come to some sort of a common understanding about. Not that you would fail to take the time to look in the back of the book to see what they mean; that would never happen.

So, when we start talking about things like RAID, we start using terms like high availability or fault tolerance. High availability means just what it sounds like: making sure that the resources your server provides are available a high percentage of the time. Fault tolerance means that if something breaks, there is something else there to pick up for the broken part, and things go on as if nothing happened.
Another term we should look at is the phrase single point of failure. A friend of mine says that you can tell the skill level of a network administrator by her level of paranoia. The really paranoid ones are the ones who have been around the block and understand that the question is not if something is going to go wrong, but when. They also know that no matter how bad they think it can get, it can get worse. In any computer system, there are components that can go bad. The reliability factor is getting much better, but stuff still does happen. You are trying to increase your odds, so that when things do go bad, you are covered. You know that certain components in a system have a higher chance of failure than others. For example, it is much more likely that a printer is out of paper or is offline than that the mainboard in the printer has gone bad. So, by looking at where our single point of failure is, we are hedging our bets.
Here is a brief example. If I provide a level of disk drive protection called mirroring, it means that two disks are hooked to a single controller. Everything that is written to one of the disks is written to the other disk. If one of the disks goes bad, the other disk is there to take over for it, and we have fault tolerance and higher availability. We have, in effect, moved our single point of failure from the hard disk back to the disk controller. You can move the single point of failure back farther than that, but that would be stealing my thunder for the section on duplexing.
Next, there is the subject of data striping. With data striping, instead of taking bits of information and storing them on one disk, you are taking bits of information and storing them across several disks. In this way, the write heads on several disks are being utilized, and performance increases dramatically. Unfortunately, there is no fault tolerance, so if any one of the disks in the stripe set goes bad, the entire set is dead in the water. This is not necessarily a good thing, so there is something called striping with parity.

Finally, we get to the subject of parity. What follows is a highly simplistic explanation of parity, but it should give you an idea of how it works. First, take a look at Figure 1.6.
FIGURE 1.6

Disk striped with parity

Disk 1:
30GB

Disk 2:
30GB

Disk 3:
30GB

Disk 4:
30GB

Disk 5:
Parity storage

Assume that each one of the first four disks is 30GB. The fifth 30GB drive is not used to store data; it is just used to store the mathematical sum of the information striped across the first four drives.

So, now we are going to save a file called RESUME.DOC to the striped set of drives with parity. In this case, let's assume that the first block of data can be represented in binary as 1010. That means that a 1 would be written to Drive 1, a 0 written to Drive 2, a 1 written to Drive 3, and a 0 written to Drive 4. Finally, the sum of 1+0+1+0, or 10 (remember, we are dealing in binary here, and in binary 2 is represented as 10), would be written to Drive 5. Parity is defined simply as the quality of sameness or being equivalent. With RAID, we are using parity not only to check to make sure things are the same, but also to rebuild things that may have been damaged. Take a look at Figure 1.7 to see what I mean.

FIGURE 1.7

Disks striped with parity and data on the drives

Finally, it is the middle of thunderstorm season, and you take a massive lightning strike. The only thing that gets damaged is Disk 4, and it is pretty much finished. See what I mean in Figure 1.8.

FIGURE 1.8

Disk striping with parity and a dead drive

Because you have instituted parity, life can go on without anyone being the wiser. If someone wants to access the file RESUME.DOC, the system recalls the file and loads the 1+0+1. It knows there is supposed to be something in place of Disk 4, but since it can't find it, it reads the parity sum of 10 and knows that all that is missing is another 0. The system can continue functioning until you get another drive installed. Hopefully, there will not be any more thunderstorms!
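The add-the-bits arithmetic above is deliberately simplified; real RAID implementations compute parity with XOR, which has the handy property that any single missing block can be rebuilt by XOR-ing everything that survives. Here is a minimal Python sketch using the chapter's 1-0-1-0 example:

    from functools import reduce

    # One stripe: four data blocks (Drives 1-4) plus their XOR parity (Drive 5)
    data_blocks = [1, 0, 1, 0]
    parity = reduce(lambda a, b: a ^ b, data_blocks)

    # Drive 4 dies; rebuild its block from the three survivors plus the parity
    survivors = data_blocks[:3]
    rebuilt = reduce(lambda a, b: a ^ b, survivors + [parity])

    assert rebuilt == data_blocks[3]   # the missing 0 comes right back
    print(f"Rebuilt block from Drive 4: {rebuilt}")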
So now that we have terms defined, we can talk about RAID.

The Levels of RAID

We are going to talk about some of the more commonly defined levels of RAID. Some of these may already be in place on your server, or none of these may be in your servers, and you may have different implementations in the same server.


RAID 0

This is disk striping without parity. Of all the RAID technologies, RAID 0 is the fastest, because the write heads are constantly being used, without any duplicate data being written or any parity being figured. With this system your server will have multiple disks, and the information is striped across the disks in blocks without parity. There is no fault tolerance in a RAID 0 system.

RAID 1

This level is commonly referred to as disk mirroring, or disk duplexing. In either case, there are two hard disks involved, and anything that is written to one of the hard disks is written to the other. In the case of disk mirroring, there is just one disk controller, so the controller is the single point of failure. Disk duplexing adds a second controller for the second disk, moving the single point of failure away from the disk subsystem to the mainboard. In RAID 1, if either disk fails, the other disk takes over. There is no parity or error checking information stored. If both drives fail, new drives must be installed and data restored from backup.

The disadvantage of RAID 1 is cost per megabyte. If you have two drives that have a published capacity of 30GB each, and they are mirrored or duplexed, the total amount of usable disk space is 30GB, not the 60GB you purchased. If you are using different-sized drives, the mirror will reflect the storage capacity of the smallest drive.
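The usable-capacity rule from the paragraph above boils down to one line of Python (a sketch; the drive sizes are just examples):

    def raid1_usable_gb(drive_a_gb, drive_b_gb):
        # A mirror only ever exposes the capacity of its smallest member
        return min(drive_a_gb, drive_b_gb)

    print(raid1_usable_gb(30, 30))   # 30, not the 60 you paid for
    print(raid1_usable_gb(30, 20))   # 20; the mirror shrinks to the smaller drive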

RAID 5

In this case, data and parity information is striped at block level across all of the drives in the chain. Again, RAID 5 takes advantage of the faster disk reads and writes that striping provides. The parity information for data on one disk is stored with data on another disk, so if any one of the disks fails, it can be replaced and the data can be rebuilt from the parity data stored on the other drives. RAID 5 requires a minimum of three drives, but usually five or more disks are used. The disadvantage here is that the controllers can become expensive.

Reasons to Use RAID 5

When you start talking about RAID 5 and higher levels, the hardware controller can become something of an issue. This can cause the price point of the implementation to climb. Obviously, you are going to use it on mission critical servers like these:

- Enterprise critical file and application servers
- Database servers
- Web, e-mail, and news servers
- Intranet servers

RAID 0+1

Now we start getting into some of the hybrid approaches. If you look at the name, and you understand what RAID 0 and RAID 1 do, you have a pretty good idea of what RAID 0+1 is. You know that RAID 0 is disk striping without parity. You know that RAID 1 is mirroring or duplexing of disks. So, RAID 0+1 is where an entire stripe set without parity is mirrored or duplexed. There will be a giant performance improvement on disk reads, and there will be some performance hits on disk writes. Data can survive the loss of multiple disks, but the monetary cost can be high.

RAID Setup, Uses, and Recovery

We are about to start bouncing around some of the objectives, so be prepared. We are going to look at the differences between Hardware RAID and Software RAID, and what happens when a drive fails. That subject is going to introduce some more terms to remember, things like hot swap, hot plug, fail over, and hot spare.

One of the things that any really good server administrator takes into consideration is what happens when things go wrong. Now, most of the levels of RAID that we have talked about have built-in levels of redundancy somewhere. There is only one level (RAID 0) where the loss of a single drive will result in data loss. In all the other types, losing a single drive, or in some configurations even multiple drives, will not result in down time. So how is RAID implemented, and what happens in each case when a drive fails?

Software RAID

RAID 0 and RAID 1 are usually defined at the software level. In this case, it is the server operating system that determines the RAID level and the level of protection. In Windows NT/2000 it can be called Drive Striping, Drive Striping with Parity, or Mirroring. In NetWare it may be called Mirroring, but the result is the same. Somewhere there is a tool or utility that will allow you to either stripe a drive and provide parity or mirror the drives.

The advantage of using Software RAID is low cost. There are no special controllers to buy. The operating system will recognize the drives and provide the level of protection that you define.


Hardware RAID

In some of the more complex implementations of RAID, a special controller or special disks need to be linked together. When you start mentioning the word special, the dollar signs usually start to light up. It will be up to the controller to define the level and type of RAID.

Because the RAID work is done in dedicated hardware rather than in software on the server's CPU, your performance will increase.

Hot, Hot, Hot

You have hot swaps, hot plugs, and hot spares. What in the world is the difference, and how do they work? It is all a matter of degree!

Hot Spare

A drive is considered a hot spare if you happen to have an extra drive sitting on the shelf that matches the type and configuration of the drives in your server. For example, if you have a SCSI-2 Ultra, 9GB Seagate on the shelf waiting in case of emergency, that would be considered a hot spare. When the hot spare gets put into play, it could be a hot plug or a hot swappable drive.

Here is an example of how a hot spare would be used. Say your server has RAID 1 level mirroring defined. The first drive in the mirrored pair has failed, for no other reason than that drives go bad, and the system has failed over to the second drive in the mirror. In this case, the system keeps on working like nothing has happened. You notice the fail over (the fact that the first drive failed and the second took over) and make plans to replace the failed drive with a hot spare when the server can be taken out of service with a minimum amount of interruption to the normal workday. When you can down the server, you shut it off, replace the failed drive with the hot spare, and bring the server back online. Once the server is online, you can then use the appropriate tool to reestablish the mirror, and the new drive will be mirrored to match the old.


Hot Plug

With a hot plug drive, the server does not have to be brought down or taken out of service to install a new drive. In the case of a hot pluggable drive, you are not replacing a current drive that has failed; you are adding disk space to the mix. You open a cabinet, plug the drive into the backplane of the cabinet, and the operating system should recognize that the drive is there. Depending on the operating system, you will have to create a partition and a volume to make the drive available.

Hot Swap

This is one of those gee-whiz things we talked about earlier in the chapter. I remember the first time I had to hot swap a drive in a RAID array with parity. When I asked a senior tech how to do it, he smiled and said, "Open, pull, push, watch, and be amazed." I got to the client site with the drive in hand and went to the server room. There was a large disk array of seven drives in a cabinet with a glass door, all sorts of flashing lights next to six of the drives, and a series of steady red lights next to the drive that had died. It didn't take a rocket scientist to figure out which drive had failed. So I opened the glass door and saw the two small rocker arms holding the bad drive in place. I moved those out of the way, grabbed the handle on the front of the bad drive, pulled, and the drive came out in my hand. I took the new drive, pushed it gently into the slot until I felt it lock, and then put the rocker arms back in place. Once that was done, the lights next to the new drive started going crazy, while the drive was automatically rebuilt from the other drives in the set.

It was seriously cool! No one on the network had any clue that a drive had ever failed! No data was lost, no time was lost, and the server was never unavailable.


We probably should have gone over this earlier, but we didn't! As you read through many of these chapters, keep in mind that there are several laws of network computing that come into play. Some of these are documented, some are only figments of my imagination, but they are important to remember just the same. Here are some of my favorites. Williams's Law: You can tell the skill level of a network administrator by his level of paranoia. The really good ones are really paranoid. Murphy's Law: Anything that can go wrong will go wrong, at the worst possible moment. Govanus's Law: The chance of completing any network upgrade successfully is inversely proportional to the visibility of the project and the proximity of your annual review. If you are about to undertake a project that will affect everyone on your network, and it is the night before your annual review, please be sure to be carrying a copy of your resume on a disk in your pocket. You may not make it back to your desk. Finally, my favorite, and this one has proven true worldwide: End users lie. Network administrators are the best end users.

Real World Scenario

The concept of hot plug, hot swap, and hot spares is certainly something you should take into account if your budget will allow. We are all striving for the longest possible uptime, though we all know that stuff will happen. Any time you can cover yourself and your key components without having to down the server to make a repair, your users come out ahead.

When you look at the key components that are most likely to suffer a failure, most can fall into the hot plug, hot swap category, and all it takes to make that happen is money! Not a problem, right? As a consultant, I am routinely called in to explain why the increased costs for a specific product are worth the money. I tend to use an approach I learned a long time ago when I was a salesperson: I try to talk the manager's language. I use terms like cost of operation. Managers know what it costs to operate a department or a company down to the penny. I can take those figures and extrapolate them to help someone make a decision. For example, if it costs the company $1,000 an hour to operate your department, and if we agree that it will take a minimum of four hours to change out a key component, the cost to the company in downtime is $4,000. If there is an added cost of $500 to make this device hot swappable so there will be no loss of service, we can show where it is a sensible decision to spend the extra money.
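That cost-of-operation pitch is simple enough to generalize (a sketch; the dollar figures are just the scenario's example numbers):

    def downtime_cost(cost_per_hour, repair_hours):
        return cost_per_hour * repair_hours

    outage = downtime_cost(1000, 4)      # $4,000 of lost operation
    hot_swap_premium = 500               # the added cost of the hot-swap option
    print(outage - hot_swap_premium)     # 3500: the upgrade pays for itself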


Fault Tolerance

Fault tolerance is the act of protecting your computing gear, whether that gear be infrastructure-oriented, as in switches and routers, or computer-oriented, as in servers and disk farms. In either case, the fundamental question you ask yourself is this: How can I protect the equipment so that a fault of some kind doesn't interrupt service? Impairment of service might be tolerable; interruption is not.

We talk about uptime of devices and services in terms of 9s. We assume that you want to keep your gear up 99% of the time; that's a given. But as we add 9s to the right side of the decimal point, the time that a device is allowed to be down, including maintenance windows, becomes increasingly smaller. Five nines of uptime equates to an allowance of only about five minutes of downtime per year, including maintenance windows. Your goal with fault tolerance methodologies is to increase the number of 9s that are on the right side of the decimal point. Five nines is optimal, but not realistic in most situations; four nines (a little under an hour of downtime per year) is a better goal. We'll talk about how to realize these goals in this chapter section.
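The relationship between nines and allowable downtime is straight arithmetic (a sketch; it ignores leap years):

    HOURS_PER_YEAR = 24 * 365

    for nines in (2, 3, 4, 5):
        unavailability = 10 ** -nines            # e.g., five nines -> 0.00001
        downtime_min = HOURS_PER_YEAR * unavailability * 60
        print(f"{nines} nines -> about {downtime_min:,.1f} minutes of downtime per year")
    # 2 nines: 5,256.0   3 nines: 525.6   4 nines: 52.6   5 nines: 5.3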

Configuring RAID

One thing you can do is use RAID to help augment your system uptime. Either RAID 1 (disk mirroring) or RAID 5 (disk striping with parity) will be beneficial to you in terms of bringing fault tolerance to your servers.

Some network operating system software allows you to set up RAID configurations without having to purchase special hardware RAID array controller cards (they're expensive). I have worked with software RAID and don't think it works very well. I much prefer hardware solutions. For starters, the card has its own processor and memory and can really go a long way in off-loading work from the central CPU. With software RAID, the CPU handles everything. Also, the software solution seems to not be as reliable as the hardware-based solution, though it may have been my fault configuring the software rather than how the software behaved. Whatever the reason, I'm saying that when you consider RAID implementations, you should pay the extra $2K or so and get the RAID controller card with the system.

Another important point about hardware RAID is that there is always some data in the card's memory. If you had an ungraceful shutdown on the server while there was some data in that card, it would be lost. Thus it's important to purchase your RAID array controller cards with a battery backup so that, in the event an instantaneous down happens to the server, the data will be safe for a time until you can bring the server back up. Keep in mind the data is being held there by a battery, so you don't have days or anything like that, but you do have some cushion you can work with.
You'll usually opt for either a mirroring or a striping-with-parity scenario for a given set of drives. You can have both kinds in your system without encountering any difficulty at all. As a general rule of thumb, you usually want your OS to be on a mirrored set of disks while your data will live on a RAID 5 volume. Some NOS software won't work on a RAID 5 volume at all.

You'll typically configure the RAID volumes through either a BIOS interface at the card's boot time or through a configuration CD that comes with the server. HP, for example, includes a wizard-like interface that you can use to configure the entire box, including the RAID array. Watch the BIOS messages at boot time and you'll be given the key sequence to enter so that you can access the card's BIOS.

There's also the concept of mirroring entire arrays, where you configure two separate drive cages with RAID 5 arrays and then mirror the arrays (vendors sometimes loosely call this RAID 10, though mirrored RAID 5 arrays are more precisely RAID 51). You've got double fault tolerance, because if the first array has two drive failures, you can break the mirror and run on the second drive array until you get the first one fixed.

Realize that just because the system's on RAID doesn't necessarily mean it'll never have to be taken down. RAID helps safeguard systems so that they can keep working until users go home and you have a chance to down the computer and make repairs after hours. You want to avoid downing servers during working hours.

Adding Drives to the Array

Adding drives to the array is very easy to do. First you physically add the drive; then you access the array's BIOS (or use the configuration CD) and add the drive to the volume that you want it to be a member of.

There is one minor thing to think about, and that is that the array must use the same amount of drive space off of each drive in the system. So if you have an array that consists of three 4.3GB hard drives and two 5.7GB drives, the array will set up a RAID 5 volume that uses five 4.3GB pieces of space, one from each disk. That'll leave 1.4GB on each of the two larger drives for you to do something else with. The RAID card will show that space hanging out there and available elsewhere. You cannot include disparate lengths of data in a RAID 1 or RAID 5 volume. You can opt to configure the leftover space any way you like, whether it's one unprotected volume of space, a mirror, or however you want to configure it. Keep this in mind as you add disks to the system. You cannot add a hard drive that is smaller than the current array is expecting and be able to configure a volume. You must provide as large a disk or larger in order to facilitate the addition. If you have some left over, it's up to you to configure the extra space as you see fit.
Lastly, remember the n-1 rule with RAID 5. You take the number of disks
you're going to dedicate to the RAID 5 array and subtract 1 from that number
to account for the space needed for the parity stripe. Thus if you have six
17GB hard drives you're putting in an array, you'll really only wind up with
5 * 17GB worth of data space because you sacrifice one disk's worth of space
for the parity stripe. Actually, the parity is usually spread across all the
disks, so you're not really dedicating one physical disk to the parity
stripe, though there are RAID implementations that will allow you to do such
a thing. With RAID 5, then, more disks means that you attain more actual
disk storage space and don't sacrifice as much space to parity striping.
More is more in the case of RAID 5.
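To make the arithmetic concrete, here is a minimal Python sketch (an
illustration of the rules above, not something from the exam objectives):
every member of the array contributes only as much space as the smallest
drive, and one drive's worth of space goes to the parity stripe.

def raid5_usable_gb(drive_sizes_gb):
    # Every member contributes the capacity of the smallest drive,
    # and one member's worth of space is lost to parity (the n-1 rule).
    n = len(drive_sizes_gb)
    if n < 3:
        raise ValueError("RAID 5 needs at least three drives")
    member = min(drive_sizes_gb)        # usable slice taken from each drive
    usable = (n - 1) * member           # n-1 rule
    leftover = sum(size - member for size in drive_sizes_gb)
    return usable, leftover

# The mixed-size example above: three 4.3GB drives and two 5.7GB drives.
usable, leftover = raid5_usable_gb([4.3, 4.3, 4.3, 5.7, 5.7])
print(round(usable, 1), round(leftover, 1))   # 17.2GB usable, 2.8GB left over

# Six 17GB drives: 5 * 17 = 85GB usable, per the n-1 rule.
print(raid5_usable_gb([17.0] * 6)[0])         # 85.0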

Replacing Existing Drives


You're usually alerted to hard drive problems in RAID arrays through events
being posted to log files in the NOS system logs. Typically it's incumbent
upon you to set up the software needed to perform this monitoring of drives
and posting to logs, so be sure you read your system documentation.
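As a loose illustration of that kind of monitoring, the Python sketch below
scans a plain-text system log for disk-related events. The log path and the
keyword strings are hypothetical examples, not any particular vendor's
format, so treat this strictly as a starting point for your own NOS and
RAID software.

# Hypothetical keywords; real RAID software defines its own event text.
KEYWORDS = ("drive offline", "degraded", "rebuild", "parity error")

def scan_log(path="/var/log/messages"):    # path is an assumption
    hits = []
    with open(path, errors="replace") as log:
        for line in log:
            if any(word in line.lower() for word in KEYWORDS):
                hits.append(line.rstrip())
    return hits

for event in scan_log():
    print(event)   # review these before scheduling after-hours maintenance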
If a drive goes bad in a RAID 1 or RAID 5 array, generally you live with the
problem until you can obtain a like replacement and then shut the machine
down after hours to fix the problem. You perform a graceful shutdown,
replace the drive, and then go into the array controller card's
configuration utilities (through the card's BIOS or the configuration CD)
to rebuild the stripe or rebuild the mirror, depending on which failed.
It's a very straightforward process.

Performing an Upgrade Checklist


As always, when you're performing replacements or adding disks to an
array, it's wise to take a little bit of time to plan out your activities
and jot them down in a checklist. Be cognizant of the need for additional
drivers or software. Be sure to read FAQs and readme docs before
proceeding. It's not a bad idea to have a lab environment where you can
test these kinds of activities before you ever have to do them in real
life, thus giving yourself some practice opportunities.

Planning for Redundancy


The key word to keep in mind when you're considering highly available,
highly fault tolerant implementations is redundancy. That's how the phone
company does it, and that's how you have to do it if you're looking for
five nines of uptime.
When you order servers and you have an eye toward redundancy, there
are many key places to look:
Hard drives You can consider multiple hard drives in a single RAID
environment, multiple RAID environments, and hot spares.
Power supplies Consider purchasing multiple power supplies with servers.
If a single power supply goes, you have a redundant unit that can keep
the server going.
Cooling fans Server cooling is of paramount importance. Consider
multiple cooling fans in servers that you order.
Network Interface Cards Dual-homing or multiple-homing a server by
adding a second or even more NICs provides a way for you to add
redundancy to the server. If a single NIC fails, users can still get in
because there's a second one in the box. Dual-homing can also be used to
provide a server path for multiple IP segments.
Multiple processors Adding more than one processor to a computer
allows symmetric multiprocessing (SMP)-aware applications to
utilize all of the processors in the system. It also provides redundancy
in the event any one processor fails.
Uninterruptible Power Supply Hook your servers up to a UPS, whether
it's a stand-alone UPS designed to keep just one server up or a big
room-size unit that can handle hundreds of servers. If the power goes out
in your building, the UPS will keep the servers up until you can perform a
graceful shutdown.
Generator For long-term outages, many companies have a fossil
fuel-powered generator that can start up and continue to supply power to
the computer room until utility power is restored.
Clustered servers Having two computers connected to a single data
source and running cluster software allows the second computer to pick
up and continue operations in case of the outright failure of one computer.


Planning for fault tolerance is all about redundancy. You should consider
applying redundancy in any of your mission-critical servers. A little bit of
money spent now can save countless hours of downtime later on.

Using the Technique of Hot Swap


As a busy server administrator you'll doubtless run into times when you have
a hard drive fail in a RAID array and you have to replace the drive. This is
called swapping the drive (go figure). There are three kinds of swapping:
Cold swap A cold swap means that you have to actually power down the
computer in order to perform the swap. Software-based RAID 5 would put
this kind of requirement on you. You'd go into the server logs, see that the
RAID software was alerting you that a drive had gone offline, and then
you'd have to schedule maintenance time to down the server and replace the
drive. Once the new drive was installed and the server brought back up, the
RAID software might have to be manually put into rebuild mode, or it might
automatically rebuild the stripe for you.
Warm swap A warm swap is in between a hot swap and a cold swap. In
a typical warm swap you don't have to power down the computer, but
you may have to instruct the RAID controller to stop allowing I/O
transactions to the array while you change out the drive. The controller
does this by simply placing any requests for I/O on hold until the drive
has been replaced and is ready for operation. The advantage over a cold
swap is that you avoid the delays incurred by drive spin-up, controller
boot-up, and negotiation with the CPU. You'll typically see warm swap
scenarios in PCI-based RAID systems.
Hot swap Hot swap allows you to replace a faulty hard drive without
having to power down the computer or introduce a pause to I/O requests
(with the exception of disallowing attempts to write to the bad component,
that is). You can replace the drive and bring it online with very little
interference to the other working components and no disruption to service.
Hot swap technology ensures high availability in RAID arrays
and, best of all, allows admins to service the unit during working hours
without having to schedule an after-hours downtime period for the
computer. Be advised that it's still advisable to perform this kind of
work after hours, because you just don't know what kind of effect you'll
have on the server. But it's good to know that in a Web server that has to
be up 24x7x365, for example, a hot swap would allow you to replace a hard
drive without users seeing a blip on the radar screen.


There's one other technique that's often used: a hot spare. In a hot spare
situation, you keep a spare drive in the computer's drive bay. You configure
the RAID array controller to treat the drive as a spare. When data is
written to the stripe, the drive is included as a backup drive. If one of
the main drives in the array fails, you can utilize the hot spare to act as
a fallback. Hot spares are handy because you simply have to go into the
RAID configuration utility and tell it to begin using the hot spare. The
downside is that you burn a hard drive you wouldn't ordinarily have to use.
All of the above systems provide high-availability scenarios in the case of
a single drive failure. Keep in mind that two or more drives failing in the
same array means the end of that array.

Creating a Disaster Recovery Plan


As an admin, you should, sooner rather than later, ask yourself this
question: If a disaster of incredible significance hit this building (a
flood, earthquake, explosion, or other cataclysmic event), how would I
restore the operations of the business back to normal? Furthermore, how
soon could I have them restored?
This is the question that Disaster Recovery (DR) asks, and it presents the
problem that you must solve. At first the DR questions are intimidating.
But DR plans are like eating an elephant: one bite at a time. To facilitate
a good DR plan you must examine several elements of your network
design:


How reliable are my backups?
Am I storing my backup tapes offsite?
Have I performed a test restoration from one of my backup tapes?
Do I have adequate replacement computers and network gear to rebuild the
current facility?
Do I have an offsite DR facility that I can roll the business to in the
event of a disaster?
How many hours or days behind can my business be in the event of a
catastrophic event and subsequent DR restoration of service?

One thing that should be obvious from looking at the above list is that you
cannot make DR decisions alone. Clearly you'll need to solicit the advice
and interaction of others in order to facilitate a robust DR plan.


Remember the basic concepts of DR: fault tolerance, the ability to gracefully recover from a fault, and redundancy. If, for example, your business
runs entirely off of Web activity, then your Web servers are of paramount
importance to you. So much so that you cannot afford for them to go down.
In such a case, a DR plan might include the following components:


Redundant ISP to act as a backup to the primary ISP
A multiplicity of Web servers, all running some sort of Network Load
Balancing (NLB) software so that no one server is used inordinately more
than the others
Redundant WAN links to your ISPs
Redundant DNS servers for name-server backup
Security software that monitors for hacker attacks, denial-of-service
attacks, and other unscrupulous events
Primary, secondary, and tertiary backup administrators for the Web servers
Clustered computers that can allow for the failure of any one server

Hopefully you get the idea. DR means that you provide an offsite place
where a redundant copy of your operation can live in case the first
instance of your operation somehow gets annihilated. Redundancy means that
you build fault tolerance into the feature set so that you avoid annoying
little failures that have the capability of driving the entire enterprise
to its knees. You put this all down in writing in a DR plan, and then you
periodically test the plan to make sure it works for today's operations.

Identifying Hot and Cold Sites


In DR terms, a hot site is your primary operational site, and a cold site
is the backup site that will become hot in the event something catastrophic
happens to the first site.
DR is expensive. That's because you have to supply redundant everything:
servers, peripherals, methodologies, even sites. In highly mission-critical
computing scenarios where the servers just cannot go down (think phone
company, think stock exchange, think national security), you must think
about ways to provide a secondary location that can very quickly become
available in the event the first site becomes suddenly (and most likely
permanently) unavailable.


Mainframe computing centers have long relied on redundant cold sites
that maintain exact copies of the primary computing environment, and
they rigorously test their DR methodology using the cold site as the
fallback site for operations. There are companies that function as DR cold
site facilities for various companies around the world. In the server
arena, site DR is something that people are just now beginning to talk
about.
Suppose you have an e-mail server that cannot go down. You might think
about introducing a cluster server scenario where one of the servers in the
cluster lives in your primary building but the other one lives offsite in a
different building, perhaps not even a building you own. There is some sort
of high-speed copper or fiber-optic link between the servers so that the
secondary one is aware of the primary one at all times and can pick up
operations in case the first one fails. This is a very high-level idea that
requires a much more intensely technical understanding of cluster and
database technologies, but you see the point. You have a cold site that's
all prepped and ready to begin operation should the hot site fail.
Keep in mind that cold sites are designed to work temporarily. Typically
they're not designed with permanence in mind. The thinking is that you'll
eventually fix the hot site and restore it back to normal operations.

Summary

So, what we have done here is give you some protection against Murphy's
Law, and a way to prove that you match Williams's definition of a
really good network administrator. There is more to be done on this front,
but that about takes care of the disk subsystem. First we are going to do
some review, and then in Chapter 2, IDE Devices, we will be looking at
clustering, Fibre Channel, CPUs, and multiprocessing.
We also talked about fault tolerance and all of its nuances. There is one
basic notion that comes into play when we think about fault tolerance:
redundancy. The acronym RAID, for example, stands for Redundant
Array of Inexpensive (or Independent) Drives. You use redundancy to
build in high availability. A hot spare drive is one that sits in the drive
cage and can be put into play by tweaking the RAID utility. Hot swap
capability means you can change out a hard drive without any interruption
to users. Warm swap means you can change out the drive in an array, but you
have to disrupt I/O requests long enough to get the drive replaced. You
avoid the
drive spin-up, controller boot, and other associated nuisances of a cold
boot, but users do suffer a lapse in service, a bad no-no in fault-tolerance
terms. A DR plan details how you'll put your company's computing back
in business after a disaster of some kind occurs, pristine as it was the day
before the disaster happened. To do this you'll denote a hot and a cold
site.

Exam Essentials
Know the difference between logical drive and physical drive A physical
drive can contain multiple logical drives, but a logical drive will usually
reside on one physical drive.
Know the different levels of RAID and what makes each level unique
RAID 0 is disk striping without parity, RAID 1 is disk mirroring or
duplexing, RAID 5 has data and parity information striped at the block
level across the drives, and RAID 0+1 is where a disk array that has been
striped without parity is also duplexed or mirrored.
Know which levels of SCSI can interoperate without an adapter and which
levels will require an adapter SCSI, SCSI-2, and Ultra SCSI all use a
50-pin connector that is interchangeable. Wide SCSI, Wide Ultra SCSI,
Ultra 2, and Ultra 160 use a 68-pin connector.
Know the appropriate length of the various SCSI cables SCSI is 6 meters,
Fast SCSI is 3 meters, and Ultra SCSI is 1.5 meters with more than five
devices. If there are fewer than five devices, then the cable can also be
3 meters in length.
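If it helps to keep those numbers straight, the following Python sketch
encodes only the lengths stated above. The exact device-count cutoff is an
interpretation of the wording here, so verify it against the documentation
for your hardware.

# Maximum cable lengths (meters) as given above; not a complete SCSI spec.
def max_cable_length(scsi_type, device_count):
    if scsi_type == "SCSI":
        return 6.0
    if scsi_type == "Fast SCSI":
        return 3.0
    if scsi_type == "Ultra SCSI":
        # Interpreting the rule above: five or more devices forces the
        # shorter 1.5m limit; fewer than five allows 3m.
        return 1.5 if device_count >= 5 else 3.0
    raise ValueError("SCSI type not covered here")

print(max_cable_length("Ultra SCSI", 6))   # 1.5
print(max_cable_length("Ultra SCSI", 3))   # 3.0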
Be comfortable with the differences between hot plug, hot spare, and hot
swap A hot spare is a device that is waiting to be put into the machine.
The other two choices are very close in meaning: A hot-pluggable device
is one that can be installed while the computer or server is turned on, and
a hot-swappable device is one that can be removed and replaced with no loss
of service to the server. For example, a single network card can be hot
pluggable. Drives in a RAID 5 array can be hot swappable: if one of the
drives fails, it can be removed and replaced, and the data can be rebuilt
on the fly without any loss of service.
Know how to configure drives Be able to add or change drives in an
array and configure accordingly.


Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Active
American National Standards Institute (ANSI)
asynchronous mode
data striping
disk duplexing
Domain Name Service (DNS)
Fast SCSI
Fast-Wide SCSI
fault tolerance
Force Perfect Termination (FPT)
high availability
High Voltage Differential (HVD)
hot spare
jumpers
Logical Drive
Low Voltage Differential (LVD)
LVD/MSE
Mega Transfer (MT)
mirroring
parity
Passive
Physical Drive
RAID 0
RAID 0+1
RAID 1
RAID 5

Redundant Array of Independent Disks (RAID)


SCSI Fast-20
SCSI Fast-40
SCSI Narrow
SCSI-2
Single Ended signaling
Small Computer System Interface (SCSI)
termination
Ultra 2 SCSI
Ultra SCSI


Review Questions
1. What is the width of the data transfer bus of SCSI-1?
A. A nibble
B. 4 bytes
C. 8 bytes
D. 8 bits
E. 16 bits
2. SCSI-1 is also referred to as which of the following?
A. Narrow
B. Slow
C. Fast
D. Wide
E. Ultra
F. Ultra 2
G. Narrow, Fast, and Wide
3. Choose one type of connector used in SCSI-1.
A. 9-pin serial
B. 15-pin serial
C. 25-pin Centronics
D. 50-pin Centronics
4. RAID stands for which of the following:
A. Redoubtful Array of Inexpensive Diskettes
B. Redundant Array of Inexpensive Disks
C. Redundant Array of Independent Disks
D. A SWAT team action


5. Which is not an ANSI standard?


A. SCSI-1
B. SCSI-2
C. SCSI-3
D. All of the above
6. Horace is the network administrator for a large dot-com. One day the
hard drive in one of the Web servers fails. The server is running hardware
SCSI-based RAID. What kind of drive changeout can Horace most likely
perform?
A. Cold swap
B. Warm swap
C. Hot swap
D. Hot spare
7. Ultra and Ultra 2 are examples of which of the following:
A. RAID 10
B. SCSI Bus Width
C. Physical Drives
D. SCSI Bus Speed
8. With LVD SCSI, how many wires will be dedicated to carrying the signal
across the bus?
A. 1
B. 3
C. 2
D. 4
E. 8
F. 16


9. What is the maximum Bus Speed of SCSI-2?


A. 5.0 to 20.0 Mbytes/second for a 16-bit data bus
B. 10.0 to 15 MT/second for an 8-bit bus
C. 2.0 to 4.5 Mbytes/second for a 16-bit data bus
D. 40 to 60 MT/second for a normal 8-bit bus
10. You are examining the possibility of putting a SCSI-2 controller in
your server. You still need to keep a device that works with normal
SCSI. Is it possible to run the SCSI device from the SCSI-2 controller?
A. No, SCSI-2 is not backwardly compatible to SCSI.
B. No, SCSI is Single Ended, and all SCSI-2 is HVD.
C. Yes, SCSI-2 is backwardly compatible with SCSI.
D. Check proper termination.
11. What is the maximum number of devices that can be part of a SCSI-3 bus?
A. 16 devices
B. 8 devices
C. 7 devices
D. 14 devices
E. 15 devices
12. What is another name for SCSI Ultra?
A. SCSI Wide
B. SCSI Fast and Wide
C. SCSI 20
D. SCSI 40
E. SCSI Fast 20


13. How does Single Ended SCSI work?


A. For any signal that is going to be sent across the bus, there are two
wires to carry it. Both wires can carry the signal.
B. For any signal that is going to be sent across the bus, there are two
wires to carry it. One wire will carry the signal and the other wire
will carry a defining voltage.
C. For any signal that is going to be sent across the bus, there are two
wires to carry it. One wire will carry the signal and the other will
be ground.
D. For any signal that is going to be sent across the bus, there are two
wires to carry it. Both wires alternate with the signal.


14. Choose one way in which RAID 3 differs from RAID 1.
A. With RAID 3, the data is always written with two types of parity.
B. With RAID 3, the data is striped in bits.
C. With RAID 3, the data is striped in blocks.
D. With RAID 3, the data is striped in bytes.
15. How does RAID 5 write data to the disk?
A. In bits
B. In nibbles
C. In bytes
D. In binary
E. In blocks
16. Can a hot swappable drive also be a hot spare?
A. Yes.
B. No.
C. It depends on the level of RAID.
D. It depends on the speed of the SCSI devices.


17. You have a SCSI controller in your server and now you wish to add
some new SCSI-2 drives to this system. Can you do so?
A. Yes, but you will have to buy a new controller.
B. Yes, SCSI and SCSI-2 devices are compatible.
C. No, SCSI-2 is not compatible with the earlier SCSI specification.
18. Which of the following solutions combines data striping across drives
with mirroring?
A. RAID 1+5
B. Hybrid RAID 5+
C. High Performance RAID
D. RAID 0+1
19. Which of the following SCSI standards can use a 68-pin connector and
therefore will be interchangeable?
A. SCSI
B. SCSI-2
C. Wide SCSI
D. Wide Ultra SCSI
E. Ultra 2
F. Ultra 160
G. Ultra SCSI


20. Your boss has asked you to implement Hardware Level RAID because he
understands that it is more reliable. He wants your opinion. What will you
tell him? Select two.
A. It is not necessarily more reliable, but it does provide better
performance.
B. Software Level RAID is less expensive but provides better performance
because the process is controlled by the network operating system.
C. Hardware RAID will cost more because he will have to purchase
a special controller and disks.
D. Software RAID will end up costing more because a special version
of the operating system has to be purchased.


Answers to Review Questions


1. D. The data transfer bus in SCSI-1 is 8 bits.
2. A, B. SCSI-1 is also referred to as SCSI Narrow or SCSI Slow.
3. D. SCSI-1 uses a 50-pin Centronics cable.
4. C. If you answered B, you are showing your age. When RAID technology
was first introduced, it was described as a Redundant Array of Inexpensive
Disks. Now, however, it has been changed to the Redundant Array of
Independent Disks.
5. C. At the writing of this book, SCSI-3 was still a proposed standard.
6. C. Because the system is SCSI-based, it likely supports the use of hot
swap. If Horace has an extra drive sitting around (one that has the slide
rails used for his computer's drive cage), all he has to do is pop the old
drive out and put the new one in, and the RAID controller should
automatically take over. Some older controllers require you to manually
begin the array rebuild.
7. D. Ultra and Ultra 2 are examples of Bus Speed.
8. C. With Low Voltage Differential, any signal that is going to be sent

across the bus will have two wires to carry it.


9. A. The maximum Bus Speed of SCSI-2 is 5.0 to 20.0 Mbytes/second

for a 16-bit data bus.


10. C. Yes, it is possible, but it is not recommended.
11. A. A SCSI-3 bus can have 16 devices.
12. E. SCSI Fast 20 is another name for SCSI Ultra.
13. C. For any signal that is going to be sent across the bus, there are two
wires to carry it. One wire will carry the signal, and the other wire will
be attached to ground.
14. D. With RAID 3, the data is striped in bytes.


15. E. RAID 5 data is striped at block level across all of the drives in the

chain.
16. A. Whether a drive is hot swappable has nothing to do with its status

as a hot spare.
17. B. SCSI-2 was backward compatible with SCSI, but for maximum

benefit, it was suggested that you stick with one technology or the
other, preferably using a SCSI-2 controller with SCSI-2 devices.
SCSI, SCSI-2, and Ultra SCSI all use a 50-pin connector that is
interchangeable.
18. D. RAID 0+1 is a hybrid approach where an entire stripe set without

parity is actually mirrored or duplexed.


19. C, D, E, F. Wide SCSI, Wide Ultra SCSI, Ultra 2, and Ultra 160 use a

68-pin connector.
20. A, C. Hardware RAID costs more because of the special controller and
disks that need to be purchased, but it provides significantly better
performance than Software RAID. In addition, it will support hot swap of
drives that fail.


Chapter 2

IDE Devices
SERVER+ EXAM OBJECTIVES COVERED IN THIS CHAPTER:

3.3 Add hard drives.
Verify that drives are the appropriate type.
Confirm termination and cabling.
For ATA/IDE drives, confirm cabling, master/slave, and potential
cross-brand compatibility.
Upgrade mass storage.
Add drives to array.
Replace existing drives.
Integrate into storage solution and make it available to the operating
system.
Perform upgrade checklist, including: locate and obtain latest test
drivers, OS updates, software, etc.; review FAQs, instructions, facts and
issues; test and pilot; schedule downtime; implement using ESD best
practices; confirm that the upgrade has been recognized; review and
baseline; document the upgrade.

3.6 Upgrade adapters (e.g., NICs, SCSI cards, RAID, etc.).
Perform upgrade checklist, including: locate and obtain latest test
drivers, OS updates, software, etc.; review FAQs, instructions, facts and
issues; test and pilot; schedule downtime; implement using ESD best
practices; confirm that the upgrade has been recognized; review and
baseline; document the upgrade.


It wasn't that long ago that SCSI drives were selling for about
$1,000 a gigabyte and memory was selling for $100 a megabyte. Maybe it
wasn't that long ago by the calendar, but in computer terms, it was eons
ago. In the late 1980s, Integrated Drive Electronics (IDE) drives were
introduced as a lower-price-point alternative to SCSI drives and some of
the other high-priced, low-performance alternatives. Since the '80s, IDE
drives have come a long way, to the point where they are being shipped in
an estimated 90% of all systems sold. Let's take an in-depth look at the
history of IDE drives, how they have overcome some of the barrier
limitations imposed by the technology, and where IDE technology is today.
We can then take a look at the types of cabling and connectors required to
install IDE devices, and some of the differences between IDE and SCSI.

For complete coverage of objective 3.3, please also see Chapter 1. For complete
coverage of objective 3.6, please also see Chapters 1 and 10.

IDE Disk Drives and Subsystems

As you remember from taking the A+ exam, disk subsystems are made
up of the hard disk, the cabling, and the disk controller. Last chapter, in
our discussion of SCSI, you saw how the controller had to be matched to the
type of SCSI technology, and how the controller played an active part in
moving data and instructions. Disk controllers can be integrated into the
mainboard, or they can be on a board that plugs directly into the
mainboard. Sometimes these are called controllers, but you may also see the
terms paddle cards or even paddleboards.


The reason they are referred to as paddleboards is simple. The boards do
not do any controlling. All controlling functions are handled by the
electronics on the drive, hence the name Integrated Drive Electronics
(IDE). The original version of IDE drives helped increase the amount of
storage space the average user could purchase without filing for
bankruptcy.

Like many of the computer industry acronyms, IDE has picked up several
definitions. Depending on the book you read, it may be Integrated Device
Electronics, or it may be referred to as Integrated Drive Electronics. This falls
under the tomato (toe-may-toe)/tomato (toe-mah-toe) argument; it really
doesn't matter much where the acronym came from as long as you know
what it is referring to and the darn things work.

Since the drive was controlled by electronics on the drive, the drive
manufacturers could encourage enhancements, because there were no
pesky controller compatibility issues to contend with. Each manufacturer
was free to include some new techniques that would increase capacity,
speed, and the average time that the drive could operate without failure,
called the Mean Time Between Failure (MTBF). Some of these advances
included error checking, the ability to automatically move contents
from blocks that were failing to blocks that were specifically set aside
for the purpose, higher disk rotation speeds to ensure faster data access,
and even giving the user the opportunity to re-map the drive geometry if
desired. Let's take a look at the history of IDE to track where it has been
up until today.

IDE and ATA


It seems like the terms IDE and AT Attachment (ATA) are joined at the hip.
If you see IDE, you are more than likely to see ATA, and if you see ATA, it
is some form of IDE device. ATA is really the formal title of the standards
for IDE drives and how they operate. IDE is kind of like Kleenex: it is
more of a trade name that refers to a 40-pin interface and the integrated
controller design. The official name should really be ATA devices.

When most people think of IDE devices, they think of IDE hard drives, but
those are by no means the only devices that take advantage of IDE
technology. There are also things like IDE tape devices and IDE CD-ROMs.


As we mentioned above, IDE was originally designed so the disk controller
was integrated into the drive itself. This meant that the drive no longer
had to rely on a stand-alone controller board for instructions, as all of
the other types of drives did. This integration brought the cost down. It
also made the drive's firmware implementations easier to manage for the
manufacturer. This meant you had a device that didn't cost very much and
was exceptionally easy to install. People loved it, and the boom of the
disk drive industry was on.

ATA History
When ATA was introduced in the late '80s, it was a hard-drive-only type of
technology. At the time the ATA standard was approved, applications and
operating systems came on diskettes, and only the real computer aficionado
had a CD-ROM device. Most CD-ROMs at the time were SCSI based and
expensive. Since there weren't many, if any, things being distributed on
CDs, the CD-ROM was not a necessity for most folks.
As applications and operating systems grew, diskette distribution
became unwieldy, not to mention expensive. There had to be a better
way, and that better way was to distribute software on CDs. After all,
CDs were very inexpensive and could hold almost 700MB of data. CDs
were also relatively impervious to end users; for an end user to do
something to damage a CD, they had to work pretty hard.
Now it became imperative that a reliable, low-cost method be made available
to distribute CD-ROM drives to the masses. The designers of the ATA
specifications suddenly needed to come up with a way to attach things like
the CD-ROM and the various tape drives or other storage devices to the
existing disk subsystem. Using the same ATA controller card to manage two
devices would be infinitely more viable than having to put yet another
controller card in an already crowded computer bus. So, the designers came
up with something called the ATA Packet Interface (ATAPI). ATAPI is a fancy
name for an extension of the ATA interface. The extension is designed to
allow several other types of devices to plug into an everyday, ordinary old
ATA 40-pin cable.
There are some differences in the way ATA supports hard drives and the way
it supports other devices. The hard drives receive support through the
system BIOS; it is up to the BIOS to define the geometry of the drive.
These other devices required a special device driver to support them. So,
for example, if you had installed an early version of the SuperWhizBang 8X
CD-ROM, you would originally need a driver from SuperWhizBang so the system
would recognize the fact that the drive was there. Back in the old DOS
days, this required editing the AUTOEXEC.BAT and CONFIG.SYS files to make
sure everything worked just the
way it was supposed to. Depending on the operating system you are using, there
may still need to be some manual configuration of devices.
The standards continued to mature, and CD-ROM manufacturers started
working together to provide support for ATAPI. As ATAPI drives became
more standardized, operating systems, and in many cases the BIOS, were
able to recognize the CD-ROM. If the OS or the BIOS could recognize the
drive, it could immediately load the driver, and if the BIOS could
recognize the CD-ROM, the CD could even be used as a bootable device. This
eventually led to some new advances that we take for granted today, with
things like CD-ROMs that will autorun programs to start installations.
Back to the good ol' days. When CD-ROMs became viable, they brought up
another shortcoming of the early ATA standard: the number of devices you
could have in an ATA chain. With the early drives, you could have a maximum
of two drives connected to a paddleboard, and there could only be one
paddleboard in the computer. As you will see, the later implementations of
the standard increased the number of ATA channels in any machine to two, so
you can now have up to four ATA devices in a system. We will discuss how to
configure those four devices a little later in the chapter, in the section
on master/slave/cable select.

IDE, the Next Generation

So, we move on in the history of computing to the early '90s. The ATA
architecture wasn't keeping up with the advances in hard drive technology.
The hard drive industry came up with a new standard, and for lack of a
better name just called it ATA 2. ATA 2 was a great improvement over ATA
because it defined faster data transfer modes using Programmed Input/Output
(PIO) and Direct Memory Access (DMA). PIO is a way of moving data from the
storage device in which all the information has to be sent through the
processor. Different levels (called modes) define PIO, and each mode has
its own data rate. This is shown in Table 2.1.
TABLE 2.1

PIO Modes and Data Transfer Rates

Mode    Rate
0       3.3 MBytes/sec
1       5.2 MBytes/sec
2       8.3 MBytes/sec
3       11.1 MBytes/sec
4       16.6 MBytes/sec

As the need for speed increased, the PIO standard couldn't keep pace. That
was when DMA came into being. Instead of the device sending information
through the processor, now the information was written directly to memory.
Because the information is written directly to memory, the Central
Processing Unit (CPU) doesn't have to do anything with it, so the overall
performance of the computer is increased. DMA and Ultra DMA can increase
transfer rates to as much as 100 MBytes/second, but we are getting ahead of
ourselves.
Back to ATA 2. In addition to the different methods of handling data,
there were many other under-the-hood kinds of things that the average user
probably wouldn't be aware of. These included some powerful drive commands,
like the Identify Drive command. This command was a godsend to technicians
everywhere. Prior to the standardization of the Identify Drive command, the
technician who installed the drive had to know some exact information on
the way the drive was configured. That usually wasn't a problem if it was
the original installation of the drive and you had all the documentation
right there, but if the drive were ever moved, or pulled out of one machine
to be used in another, the configuration information tended to get lost.
(Not that something like that would ever happen to me; nope, never
happened, because I always write the drive specifications on the outside of
the drive with a permanent marker. And if you believe that, let me know, I
have a great bridge to sell you just outside of McCausland, Iowa.) Then you
had to search for the documentation in your exceptional filing system or
call the manufacturer.
Anyway, that problem went away with the updated drives and updated
BIOS. If the drive was an ATA 2 device, you simply had to install the drive
in the computer and turn the computer on. The BIOS would go out and
discover the drive automatically. The drive tells the BIOS how it is built,
and then the BIOS makes sure the rest of the computer knows how to address
the drive and how much viable space there is. It is a wonderful thing. It
is really one of the first instances of Plug and Play, only this
installation happened well before the operating system even started to
load.


Another advance was the way the drives handled the data transfer. Instead
of moving the information bit by bit, or even byte by byte, ATA 2 began to
allow block data transfers, called block transfer mode. Think of it this
way: Imagine you have just gotten back from the grocery store after buying
one month's worth of groceries for a family of four. Further imagine that
all you could carry into the house was one item at a time. It would take
you a really long time to get everything into the house. That is the way it
was before block transfer mode came into play. Now, with block transfer
mode, compare how much more efficient it is to carry the groceries in one
or two sacks at a time. It may still take you a while to move all the stuff
into the house, but not as long as the other way. Block transfer mode just
moved more information in a single operation.
These block transfers were made possible by a new way of defining and
addressing the sectors on the hard drive. This was done using a process called
Logical Block Addressing (LBA). LBA had an additional benefit, because it
managed to overcome the early IDE size limit of 528MB.
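For context on where that 528MB figure comes from (a detail the text does
not spell out, but one that is easy to verify): the early BIOS
cylinder/head/sector scheme topped out at 1,024 cylinders, 16 heads, and 63
sectors of 512 bytes each. A quick Python calculation shows the ceiling:

# The classic CHS addressing ceiling that LBA was designed to overcome:
# 1,024 cylinders x 16 heads x 63 sectors x 512 bytes per sector.
cylinders, heads, sectors, bytes_per_sector = 1024, 16, 63, 512
limit = cylinders * heads * sectors * bytes_per_sector
print(limit)           # 528482304 bytes
print(limit / 10**6)   # roughly 528 MB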
ATA 2 maintained its backward compatibility with ATA drives. It used
the same 40-pin physical connector used by ATA, and an ATA 2 drive could
be used in conjunction with an ATA drive.
There are some other ways that ATA 2 may be described. For example,
you will hear terms like Enhanced IDE (EIDE) or Fast-ATA. Neither of these
is a standard; each is just a different implementation of the ATA 2
standard. EIDE, which started out as a particular manufacturer's
implementation, has become so popular that it is now more or less a generic
term.
ATA 2 also introduced the capability of having two channels of two
devices each per paddleboard. This meant that the total number of IDE
devices that were possible in a system had climbed to four. The channels
were referred to as the primary channel and the secondary channel.

ATA 3
The next standard is ATA 3. ATA 3 does not do anything for faster transfer
modes, but it does provide for password-based security and better power
management. It also has a technology called Self-Monitoring, Analysis and
Reporting Technology (SMART). SMART will tell you when a drive is going
bad before it exhibits any symptoms that you may be aware of.
If you sometimes wonder why your computer takes a long time to respond
after you have let it sit for a while, ATA 2 may be part of the reason. You
see, it also added some sophisticated power management features that would
put the drive to sleep after it hadn't had anything to do for a while. ATA
3 is also backwardly compatible with ATA 2, ATAPI, and ATA devices. You may
also
see the term EIDE applied to ATA 3 devices, since there has been no significant
improvement in data transfer.

ATA 33 or Ultra DMA 33

Ever faster means ever better, and one step forward for ATA devices was
the ATA 33 standard. This is where the ATA standard began moving
away from the PIO standard and began taking advantage of DMA.
The ATA 33 specification provided high-performance bus mastering that
gave 33 MBytes/second DMA data transfer. Now, bus mastering sounds like
a real geeky term that must have some deep inner meaning. When you get
down to it, there really isn't much mystery. Bus mastering is a technology
that the drive or controller can use to direct the way data traffic is
routed through the input/output path. In the case of ATA 33, the bus
master sent the information directly to memory, rather than to the
processor as had been the case in previous implementations.
Since there were new ways of doing things, there had to be some way
of designating the new technology. Let's face it, ATA 33 is not really
all that sexy, so this implementation can also be called Ultra DMA 33
or just UDMA 33. If you install one of these devices, you will need an
ATA 33 drive, controller, and BIOS support to receive the full benefit
of the technology; however, it is fully backwardly compatible. Like all
the previous implementations of IDE, it can use a 40-pin IDE-type cable
unless one of the following is true:


You are using a low-quality, damaged, or weakened cable.
The system has excessive signal noise caused by multiple drives, a dual
power supply, or even an integrated Cathode Ray Tube (CRT).
The system has been put in overclocking mode, or has been set beyond
the manufacturer's specifications.

ATA 66
Well, if ATA 33 moved data at 33 MBytes/second DMA, you will never guess
what rate ATA 66 moves data at. You got it. It uses even faster
high-performance bus mastering for a 66 MBytes/second DMA data transfer
rate. This can also be called Ultra DMA-66 or just UDMA-66.
If you are going to install an ATA 66 drive, you will need the appropriate
drive, controller, and BIOS. Again, it is fully backwardly compatible with
the previous ATA standards, but the cabling has changed. The change was
necessary because the transfer rates became so high that there needed to be
more protection against things like crosstalk and electromagnetic
interference
(EMI). To handle this problem, 40 additional conductors were added as
ground lines. The ground lines act as shields between the lines that carry
the real signals back and forth. The bottom line is you will still need a
40-pin cable, but now it will have 80 conductors.
The computer's operating system must also support these rates of data
transfer.
ATA 66 Issues
If you are installing this type of disk subsystem in your server and run
into problems, here are some things to check:
Make sure you have the right cable. You can tell you are using a
40-pin/80-conductor cable because it will have a black connector
on one end and a blue connector on the other end, with a gray
connector in the middle. The blue connector goes to the motherboard,
the gray connector is for the slave device, and the black
connector is for the master drive. In addition, the cable has something
you probably won't be able to see: Pin 34 should be notched
or cut. The reason will become plain in the next bullet.
The motherboard or mainboard controller must be capable of supporting
the ATA 66 standard. A compatible controller has a detect circuit that
can recognize the fact that line 34 is not present on the cable. If the
detect circuit is missing, the motherboard cannot reliably tell which
cable is present and may try to configure the device for the wrong
transfer rate.
Some controllers may not be able to handle ATA 66 on both the
primary and secondary channels. If you are having problems installing
the device on the secondary controller channel, you may want to move
it to the primary channel and see if that solves the problem.
Make sure you have the right controller card driver. Make sure the
BIOS is upgraded, and any patches that need to be applied to the
motherboard have been taken care of.
Be sure you are using a DMA-capable operating system and that the
DMA mode has been activated.
Make sure the drive has been configured to run at ATA 66 transfer
rates. Some drives ship with the higher transfer rate disabled by
default; enabling the higher transfer rate is done with either a jumper
switch or with a software setting.


ATA 100
The most recent advance in the world of ATA/IDE is the release of the ATA
100 interface. As you can tell from the name, the ATA 100 specifications
allow for the transfer of data at a rate of 100 MBytes/second. This is the
transfer rate on the host-to-drive bus. The new interface does maintain
some of its history, using the same 40-pin, 80-conductor cable as ATA 66.
This means that, like all the other devices we have talked about so far,
the ATA 100 cable can be used with other, slower drives. These can include
things like hard disks, removable media disks, CD-ROM drives, CD-R/RW
drives, ATA tape drives, and DVD-ROM drives.
There are other advances made with ATA 100. One is something that has
been around the computer world for a while: the Cyclic Redundancy Check
(CRC). The CRC is a very high-level method of checking to make sure the
transferred data actually made it through the transfer process without
becoming corrupted. It is just a data reliability check.
It works like this. When the device that is transferring the data gets
ready to send it, it attaches an extra set of bits to every frame of data.
These extra bits are called the Frame Check Sequence (FCS), which acts as a
type of verification attached to each frame. When the frame is received,
the receiver does the math and checks to make sure the answer is what it
expects. If it is, all is good. If it isn't, the frame has been corrupted
and needs to be retransmitted.
Let's look at a really simple example. Remember when you were kids and
had those really cheesy secret decoder rings that came in cereal boxes?
That way you could send messages to your friends, and if the teacher
intercepted them, she couldn't read them out loud. Well, the basis of that
was usually some kind of mathematical formula. We will assume that the
sender is going to multiply everything by 3 and that the receiver knows
that. So, we take a look at a simple four-bit frame:
1101
Now, since we are working with a frame, that is a binary number, not
a decimal number, so 1101 translated from binary to decimal is 13. Since
we agreed we are going to multiply everything by 3, our 13 becomes 39.
Converting that to binary, we have this result:
100111
Now, we are going to make another assumption: our packet is made up of two
parts; the first contains the answer, and the second part contains a
separator of three sets of 10 (that is, 101010) and then the data.
When I send this to you, it is going to be the answer, 100111, followed by
101010, followed by 1101. So, you will get a string that looks like this:
1001111010101101
When you receive it, you break the transmission down by removing the
101010 and keeping everything else. So you are left with 100111 (decimal
39), which you divide by 3. You come up with 13, or binary 1101, which you
compare to the data portion. Since the two parts are equal, life is good
and you can go on to the next transmission. If they are not equal, you send
a message back to me, telling me to retransmit the information. Simple,
huh? Why is such a simple technique popular? Well, it has three things
going for it. First of all, as you have seen, there is the way it can
detect errors even in extreme circumstances. Secondly, the whole process
does not add very much overhead. And thirdly, because it is a mature
process, it is relatively easy to implement.
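To tie the example together, here is a toy Python sketch of the
multiply-by-3 check just described. It only illustrates the
attach-verify-compare idea; the real CRC used by ATA 100 relies on
polynomial math, not multiplication.

# A toy frame check modeled on the multiply-by-3 example above.
SEPARATOR = "101010"   # the three sets of 10 from the example

def send(data_bits):
    # Build answer + separator + data, exactly as in the example.
    answer = format(int(data_bits, 2) * 3, "b")
    return answer + SEPARATOR + data_bits

def receive(transmission):
    # Split out the two parts and verify the relationship still holds.
    # (A real protocol would use fixed-length fields; split() works here
    # only because the demo string contains a single separator.)
    answer, data_bits = transmission.split(SEPARATOR)
    if int(answer, 2) != int(data_bits, 2) * 3:
        raise ValueError("frame corrupted, retransmit")
    return data_bits

wire = send("1101")
print(wire)            # 1001111010101101, matching the string above
print(receive(wire))   # 1101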
As is the case with some of the other ATA specifications, in order to get
the most out of ATA 100, you should have a controller that meets the ATA
100 specifications. This controller can be either integrated into the
motherboard or external. Your hard disk and other devices should support
the ATA 100 interface, and you should have one of the 80-conductor cables
that supply grounding.
You may also see the ATA 100 interface referred to as Parallel ATA,
since all data is being transferred in parallel. This means that the system
can transfer several bits of data at the same time. While this seems
efficient, there are some drawbacks. First of all, IDE is designed to be
less expensive than SCSI, but that 80-conductor cable is expensive to
manufacture and it takes up a lot of space in the PC case. Because the
cable (actually any IDE, SCSI, or floppy drive cable, for that matter) is
bulky, it can block or inadvertently redirect airflow from the fan, causing
the system to overheat. As you know, when we start talking about Parallel
in the world of computers, that means there may also be a Serial, and that
is what is just around the corner. Several manufacturers are looking at
getting much faster transfer speeds for the IDE standards, and this is just
not going to happen with the current ATA design.

What Is Next for the ATA Standard?


The latest and greatest for ATA is called Serial ATA. There is a consortium
of manufacturers working together to bring the new standard to life. These
manufacturers include some of the biggest names in the industry, like APT
Technologies, Dell, Intel, International Business Machines (IBM), Maxtor,
Quantum, and Seagate.


This group is shooting for a new interface that will increase throughput to
at least 160 MBytes/second, with later versions reaching 528 MBytes/second.
In order to do this, the cable design is going to have to be radically
altered. Instead of the current 40-pin/80-conductor cable (which allows for
only four attachments per system, two per channel), the new cable will be
much smaller, with only four signal pins and a few more pins for power and
electrical ground. What is this going to do to the current technology?
According to the Frequently Asked Questions (FAQ) at the Serial ATA
Working Group's Web site (www.serialata.org), the new implementation is
going to be designed so that it will drop into a PC and be compatible with
the software, meaning it will run without modification to your current
computer (other than the appropriate controller and devices). Since the
cables will be smaller, they will be easier to route and easier to install.
What about all your old stuff? It is anticipated that there will be a
period where both the old parallel standard and the new serial standard are
available. Now, this could cause a problem. Since both types of devices are
going to show up in the same machine, and since each will have its own
interface, the Serial ATA group expects that there are going to be some
adapters to allow the serial cable to handle the old 40/80 devices.
Serial ATA is going to support all the normal ATA and ATAPI devices,
including CDs, DVDs, tape devices, high-capacity removable devices, and
Zip drives. One of the other goals is to make the devices easier to
upgrade, because the Serial ATA group is planning on eliminating jumper
settings for defining the device's role.

For more information on jumper settings and drive roles, see Master/Slave/
Cable Select and Jumper Settings later in this chapter.

Serial ATA is not planned to be a threat to technologies like USB, because
Serial ATA is going to be internal to the computer only. There is not going
to be any type of external interface for either PC storage devices or other
peripherals.

Cabling and Connectors

As we have gone through the discussion of ATA devices, you will have
noticed that there were basically three types of connectors. The earliest
was the 40-pin connector, which handled things up to ATA 66. With ATA 66,
the throughput was so great that 40 more conductors were needed, leading
to ground to help prevent crosstalk and EMI. How can you tell them
apart? Take a look at Figure 2.1.
Now if you look closely, you can notice several things. The first is the
thickness of the conductor channel. With the 40-pin connector, the channel
is much thicker than it is with the 40/80 connector.
FIGURE 2.1

ATA IDE cables

ATA/33 cable
40 conductors

ATA/66 cable
80 conductors

The next thing I would like you to notice is that there is a dark line down
the right side of each cable. This line indicates the location of Pin 1. That will
become important in just a few seconds. Each of these cables, although you
cannot see it, has two other similar connectors on it. One of those connectors
would attach to the controller and the second connector would attach to
another ATA device.

You may be asking yourself why there is only one extra connector. Remember
that with IDE, unlike SCSI, there can only be two devices in a chain; SCSI can
have seven. Depending on the IDE controller, there can be up to two IDE chains
in any computer, for a total of four devices. Also, SCSI can handle external
devices, while ATA cannot.

Let's talk about installation. First of all, take all the usual
precautions. Turn the computer off, and unplug it. Always work with an
antistatic mat and an antistatic wrist strap. The antistatic mat is made of
a conductive material that is set on the top of your worktable, and then
the computer or other component is set on the mat. When the antistatic
wrist strap is fastened to the mat, the electrostatic charge level of
anything placed on the mat will
become equalized with the charge level of the mat, and these will become
equalized with the charge level of your body. After the charges have been
equalized, electrostatic discharge (ESD) sparks will not occur. Now, let's
assume that you are installing a device on a system that has the controller
for both channels built right into the motherboard. We will also assume
that you have already mounted the device in the case. The first thing you
have to do is attach the cabling to the motherboard. Remember that colored
stripe? This is where it comes into play. Since we are going to be adding
another device to the system (yes, this is another assumption), you locate
the connector for the second IDE channel. It should be marked on the
motherboard with something really creative like IDE-2. Then, looking very
carefully at the motherboard, you will see a small 1 near the end of one of
the connectors. That shows you where Pin 1 is on the motherboard. Now, Pin
1 on the motherboard has to match Pin 1 on the cable or things just will
not work. Once you have located Pin 1 on the motherboard, carefully line up
the holes on the connector with the pins on the motherboard, keeping the 1s
together. Push down gently until the cable is snug to the motherboard.

Here are a couple of tips. First of all, be careful to make sure that the
pins are all lined up with the holes on the connector before pressing down
too hard. If you should happen to bend or break one of the pins, it will
probably ruin your day. That is especially true if the controller is
embedded in the motherboard; that would mean replacing the motherboard,
usually an expensive proposition. Secondly, once the cable is firmly
attached to the mainboard, take a permanent marker and mark the channel in
big bold numbers, so the next time you have to add something to the IDE
chain, you can immediately know which channel you are dealing with. The
channel information is silkscreened on the motherboard, but I usually need
a flashlight and a magnifying glass to read it. This way is just simpler.

Once the cable is attached to the motherboard, you can attach the cable
to the device you are going to install. Again, check to find Pin 1. If you
can't find Pin 1, look closely at the male connectors on the drive. There
will usually be either a space without a pin, or there will be a notch in
the plastic connector sleeve. Check the end of the cable, and you may see
one of the pinholes blocked, and you may also see a notch on the cable.
Line those up, plug the cable in, and seat it firmly. Plug the power lead
from the power supply into the device, and you should be ready to power up
the computer. The ATA 66 cable is keyed; remember, the blue keyed end
attaches to the motherboard. If a standard ATA cable is installed in
reverse (Pin 1 to Pin 40), the hard drive LED will stay on continuously.

How do you know if the new device is there? Well, depending on the
computer you are using, watch what happens when the system boots. Some BIOS
implementations will show you the devices they find as they go through the
Power On Self Test (POST). Otherwise, you may have to access the BIOS to
see if the device has been recognized, or, depending on the operating
system, the new device may be visible through something like Windows
Explorer.

Real World Scenario


ATA/IDE installations, especially of a single device on a channel, are
really painless. They usually go easily. If, however, things are not
working as planned, here are several things to check. First of all, does
the device have power? I have done a great job of installing devices only
to forget to plug in the power lead from the system power supply. If it has
power, check to make sure the ATA connector cables are securely fastened to
the device and to the motherboard. You may need to unplug them and re-seat
them. Don't try to tell me they look snug, just unplug and re-seat; it
saves time arguing. If things still aren't right, recheck the Pin 1 to Pin
1 issue. Those are the three most common (and most embarrassing) issues you
can have with ATA installations.

Master/Slave/Cable Select (CSEL)

All through this chapter, we have been mentioning the fact that there
can be two devices, and only two devices, in an ATA subsystem. When you
start examining the advantages and disadvantages of IDE versus SCSI, that
is just one of the areas where IDE falls short. The other area is in the way the
two devices are linked together.
As you know by now, IDE stands for Integrated Drive Electronics. All of the
drive's intelligence is on board every single drive. That is a great thing if you
have only one drive on the subsystem, but when there are two drives hooked
together and both want to be the brains of the operation, things don't work
well. With IDE devices, you have to relegate one of the drives from being the
brains of the operation to being the go-fer. This is called designating one of
the drives as the master and the other as the slave.
So, there are two ways a single channel of IDE components can be strung
together. Take a look at Figure 2.2.

FIGURE 2.2

IDE drive configurations

As you can see, in the top part of the diagram there is a single drive
attached to the IDE host adapter. In the bottom part of the drawing, there
are two devices taking orders from the same host adapter.
Defining the master and the slave is done with jumpers. Now, there are
three possible settings:


Default: master with no slave present

Master with slave present

Slave with master present

Jumper Settings
Figure 2.3 shows what the business end of an ATA device looks like. If you
look closely at the picture, you will see that there are three sets of pins circled.
FIGURE 2.3

Business end of an IDE/ATA device


Jumpers

You will also notice that there is a small piece of plastic covering two of the
pins. This very small but very powerful tool is called a jumper. You see, each
set of pins represents a channel that the information signal can take from the
controller to the electronics on the drive. The presence or absence of the
jumper completes a circuit that defines the path the electrical impulses will
take. For example, if there were no jumpers present, the information would
follow the path so that the drive would be configured as a master, with no
slave device present. Look closely at Figure 2.4 to see the different settings.

FIGURE 2.4

Master/slave jumper settings

In Figure 2.4, the master/slave selection switch is designated as J8. If there
were no jumpers present, the drive would be configured as a master, with no
slave device present. A jumper covering Pins 3 and 4 of switch J8 may
designate the drive as the master with a slave present. The other drive in
the chain would then have to have a jumper covering Pins 5 and 6, indicating
that it is the slave device, taking all of its instructions from the master.
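To make the three states concrete, here is a minimal sketch that encodes the
jumper table for the hypothetical drive shown in Figure 2.4. The pin
assignments are just that figure's example; as the next paragraphs stress,
real drives vary, so the documentation is the authority.

# Jumper table for the hypothetical drive in Figure 2.4 (switch J8).
# Real drives use different pins; always check the documentation.

J8_SETTINGS = {
    frozenset():       "master, no slave present",   # no jumper installed
    frozenset({3, 4}): "master, slave present",
    frozenset({5, 6}): "slave, master present",
}

def describe_jumpers(pins_covered):
    """Translate the covered pins on switch J8 into a drive role."""
    return J8_SETTINGS.get(frozenset(pins_covered), "unknown; check the docs")

print(describe_jumpers([]))      # master, no slave present
print(describe_jumpers([3, 4]))  # master, slave present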

If your controller or motherboard supports two ATA drive channels, each


channel can support a master and a slave device. Each channel must have the
master and slave devices properly configured, or that channel will not work
properly.

Now it would be a wonderful thing if I could tell you that each set of pins
for the master/slave relationship was labeled J8, and that in each and every
case, no jumpers indicated master, jumpers across 1 and 2 indicated master
with slave present, and jumpers across 3 and 4 indicated slave with master
present. It would be a wonderful thing, but it would not be the real world.
Now, while it is generally true that the absence of a jumper usually indicates
a master with no slave present, in the real world, things just may not be what
they seem. When you are configuring any ATA devices, be sure to check the
appropriate documentation. If you can't find it, check the Web. Be sure to
check the documentation before you start removing jumpers and putting
jumpers back on again. Trust me, it will make your whole life a lot easier.

If you have installed multiple ATA devices, and one of them is recognized by the
system and the other isn't, or if neither of them is recognized by the system, shut
the machine off and start over. Your jumpers are in the wrong place. If you get
things really flummoxed, you may want to go back to the beginning; in other
words, install the first device as a master with no slave. Check to make sure it is
recognized. Remove the first device, and install the second device as a master
with no slave; check to make sure that it is recognized. Once that has been done,
you know both your devices are good. Then configure one as the master and the
other as the slave and install them. Check to make sure they are both recognized.
If not, check to make sure the cable is tight. If the cable is tight and one (or both)
of the devices is still not being recognized, and you are absolutely, positively
certain the jumpers are 100% correct, replace the cable. Make sure you replace
the cable with the right type of cable for the most advanced type of ATA device
in the chain. In other words, if you have an ATA 66 device in the chain, you
should be using a 40/80 cable, as sketched below. In this scenario, the potential
problem areas are the jumper settings and the cable. As a last resort, replace the
jumpers. Sometimes they lose the metal sleeve that covers one of the pins, and
contact is not made.
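Here is a minimal sketch of that cable rule. The standard names and cable
descriptions simply follow this chapter's usage; the point is only that the
chain takes its cable from the most advanced device on it.

# The chain needs the cable required by its most advanced ATA device.
# Standards are listed from oldest to newest.

CABLE_FOR = {
    "ATA":    "40-pin/40-conductor",
    "ATA-2":  "40-pin/40-conductor",
    "ATA-3":  "40-pin/40-conductor",
    "ATA 33": "40-pin/40-conductor",
    "ATA 66": "40-pin/80-conductor (40/80)",
}
ORDER = list(CABLE_FOR)   # insertion order doubles as oldest-to-newest order

def required_cable(devices):
    """Return the cable for the newest standard present on the chain."""
    newest = max(devices, key=ORDER.index)
    return CABLE_FOR[newest]

print(required_cable(["ATA 33", "ATA 66"]))  # 40-pin/80-conductor (40/80)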

Cable Select
Now, if you have been really sharp, you will have noticed that there were three
sets of pins shown in Figure 2.3, and up until now, only two sets of those pins
had a reason to be jumpered. The third set is for cable select (CSEL), which
does just what it says: it lets the cabling decide which drive will be the master
and which drive will be the slave. CSEL is one of those features of the ATA
specifications that has been around for a while, but you may never have had
an opportunity to work with it. There were some problems with the original
specifications. Look at Figure 2.5, which shows a CSEL setup with one drive
added; it has been assigned the drive letter C:.

FIGURE 2.5

CSEL with letter C: assigned to drive


Typical 1-drive cabling using old CSEL

Drive 1
C:

Look at Figure 2.6 to see what happens when you add a second drive.
FIGURE 2.6

CSEL with two drives assigned


Typical 2-drive cabling using old CSEL

Drive 0
C:

Drive 1
D:

You will see that cable select has automatically assigned the letter D: to
the second drive. Now, if you have completely used all of the space on the first
drive for C: and the second device is a CD-ROM, all is fine. What happens if
you haven't? Say you have already created a second partition and assigned it
the letter D:. Now you have chaos.
Some controller manufacturers have done serious work to solve the problem,
but for the most part, especially in a server implementation, you may want to
be really sure and just take matters into your own hands, configuring the
settings yourself.

SCSI versus ATA/IDE in a Server

This is one of those discussions that is bound to border on religion and
politics. Some people will say never to put an IDE drive in a server. Others will
say they work just fine, and with the high MTBF, massive capacity, and
lower cost, why not? So, always being the diplomat, I will say, it depends.
It depends on the implementation and also on the depth of the pockets of
the company buying the server. SCSI offers superior redundancy as well as
far greater expansion. ATA/IDE cannot yet match that, owing to the
limitations of the technology. SCSI is also expensive.
Analyze the implementation. If you have something that needs 99.99999%
uptime, look to SCSI and RAID arrays or duplexing at a minimum. If you have
a server that is not mission critical, and cost is a major factor, look to ATA and
a really good backup system.
Don't get caught in the ATA disk-mirroring dilemma. Some folks will say,
"Hey, we are using two IDE disks that cost us $150 apiece for 60GB. We
have just used software mirroring, and so we have a set of mirrored disks for
about $300." It's true, you have mirrored disks, until one of them fails. If the
drive that fails is the slave, the system will perform as advertised. If the drive
that fails is configured as the master, your mirror has just lost all its
intelligence and the system is down. At this point, mirroring ATA/IDE drives
on the same channel just does not work.

Real World Scenario


As I mentioned in the chapter, installing IDE devices has really become
close to a no-brainer. The most difficult thing you will have to do is make
sure that you have the master and slave settings correct. The cables will
usually have either a notch that fits into an appropriate spot on the drive,
or one of the pins will be blocked out, making it almost impossible to plug
the cable into the drive the wrong way. That leaves connecting the cable
to the motherboard or to the controller.
Each cable will have a colored stripe down one side. This indicates Pin 1.
When you look at the connector on the motherboard or on the controller,
you will see a small 1 silkscreened somewhere near the connector. Match the
stripe with the 1 and you should be in good shape. Another way to tell Pin
1, if you have really good eyes, is to look at the solder connection of the pin
to the motherboard. You will notice that all the solder connections are
round, except one. That one is square, and that indicates Pin 1. This can be
used as a fallback if you cannot locate the printed information.
If you are installing devices, there are certain indications that things are not
necessarily correct. For example, if you are adding a CD-ROM to an IDE
chain, when you turn the system on, the power light on the CD-ROM should
come on briefly and then go off. If the light stays on, you either have the
cable reversed or you have misconfigured the master/slave settings.

If you are installing new devices into a computer, and things are not working
as planned, the first thing to check is the configuration of master and slave.
After that, check to make sure the cables are plugged in properly.
Oh, yeah, one other thing. Don't be like me. I tried for about 15 minutes to
get an IDE CD-ROM to be recognized by the system before I noticed that,
while the master/slave setting was right and the cable was in the right way,
having a power cord connected should have also been a priority. There are
times we all do really dumb stuff, and I think I hold the record!

Adding Hard Drives

Adding hard drives to a computer can be extremely easy or can present
you with numerous questions and difficulties. The funny part is that it's your
choice as to whether things will be easy or not. You must understand the
choices you have relative to hard drives and when to pick which selection.
You must also strive to get some hands-on experience with various hard
drive installations so you're very comfortable with any situation you're
thrown into.

Verifying That Drives Are the Appropriate Type


You first must determine whether you're dealing with an Integrated Drive
Electronics (IDE, also called Intelligent Drive Electronics) or a Small Computer
System Interface (SCSI) hard drive.

The term AT Attachment (ATA) is synonymous with IDE. ATA has gone
through several version iterations, mostly due to increased computer bus
speeds. Visit www.webopedia.com and perform a search on the keyword ATA
for more information.

The easiest way to tell the two apart is to simply look at the connector
cables for each. IDE uses a 40-pin connector, and SCSI uses anywhere from
Centronics and DB25 connectors (for SCSI I) to a 50-pin connector for SCSI II
and a 68-pin connector for SCSI III. There's no mistaking an IDE cable for a
SCSI cable. So, when in doubt, even if the hard drive doesn't have a label or
a cable,
you can count the number of pins it will accept and you'll know what kind
of drive you're dealing with.
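As a toy restatement of that rule, the sketch below just encodes the pin
counts listed above; any count it does not recognize should send you back
to the drive's documentation.

# "Count the pins" lookup, using the counts given in this section.
# (SCSI I external connectors are Centronics or DB25 and are obvious
# on sight, so they are not in the table.)

PIN_COUNTS = {
    40: "IDE/ATA",
    50: "SCSI II",
    68: "SCSI III",
}

def identify_drive(pin_count):
    return PIN_COUNTS.get(pin_count, "unknown; check the documentation")

print(identify_drive(40))  # IDE/ATA
print(identify_drive(68))  # SCSI III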

Note that it's possible to mix IDE hard drives with SCSI drives in a system.
Normally I don't like to do that because it can be very confusing to try to figure
things out. Simple is better. But keep in mind that drive mixing can be done.

Another interesting thing that you might get into, though it's not as common
today, is the need to know the number of cylinders and heads that an IDE
hard drive comes with. In computers with an older system BIOS, the computer
didn't recognize the IDE hard drive until you keyed in the number of cylinders
and heads the drive was using. Then the BIOS would (most times) recognize
the drive configuration and bless it as usable. Today the system BIOS
autodetects the hard drive's heads and cylinders and you don't have to go
through that rigmarole. The problem with the cylinders/heads scenario is that
some hard drives didn't come with that information stamped on them! You
had to go to a book or get on the Web (or on a BBS in the old days) to
download a schematic for the drive so you knew what to plug into the
system BIOS.
SCSI is much easier to set up because you don't have to worry about getting
master/slave relationships right, nor do you have to be concerned about the
BIOS and whether it detected the drive's heads and cylinders correctly. On top
of that, you can string several SCSI devices together (up to 7 for SCSI I, 14 for
SCSI II and III), so you can have a veritable Christmas tree of SCSI hard drives.
You have two or three issues to be concerned about with SCSI drives,
though. First of all, you need to be worried about properly cradling the drives
and getting adequate cooling to them. It's not a wise idea to cram bunches
of SCSI hard drives into a clone tower's drive bay just because it'll accept
them. Please be cognizant of the heat that a hard drive can put out and the
potential for burning up all the hard drives in the system if you don't account
for cooling.
Also, you'll have to make sure your SCSI IDs are correct. This is usually
quite easy to do. Most internal SCSI hard drives use jumper pins, and you'll
simply have to read your drive's documentation to tell how to set it to the SCSI
ID you're interested in using. Typically you won't use ID 7. That's most often
reserved for the SCSI adapter itself, hence the seven-drive SCSI I limitation.
I like to set it up so that in, say, a three-drive system, I set my boot disk
for ID 0, and the next two for ID 1 and ID 2. If you have a SCSI CD-ROM
you're hanging off the system (a pretty rare occurrence), you could set it at
ID 3. Ditto for other SCSI gear.

Finally, it's important to match the speed of the drives. Older SCSI drives
operate at 7,500 RPM, but today's SCSI drives run at 10,000 RPM. It's not
a wise idea to hang a 10,000 RPM drive in a system with other 7,500 RPM
drives. I don't think it'll break anything, but you'll see variations in I/O and
could experience some funny activity with the machine.

Confirming Termination and Cabling


There is an incidental item that you'll want to pay attention to with SCSI:
the termination. Most of today's SCSI cards are self-terminating, but you'll
still have to consult your drive's documentation to see if there's a termination
jumper pin. The basic rule is that each end of the SCSI chain must be
terminated. So if your card is sitting at ID 7 and it's terminated, you'll also
have to ensure that the device at the other end of the chain is terminated,
say a hard drive hanging off of ID 0. Note that it's not necessary for you to
start with ID 0.
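Pulling the ID and termination rules together, here is a minimal sketch of a
sanity check you could run over a planned chain. It is my own illustration of
the rules just described, not any vendor's tool, and the device names in the
example are made up.

# Check a planned SCSI chain: unique IDs, adapter at ID 7, and both
# physical ends of the chain terminated.

def check_chain(devices):
    """devices: list of (name, scsi_id, terminated), in physical cable order."""
    problems = []
    ids = [scsi_id for _, scsi_id, _ in devices]
    if len(ids) != len(set(ids)):
        problems.append("duplicate SCSI IDs on the chain")
    if 7 in ids and not any(name == "adapter" for name, i, _ in devices if i == 7):
        problems.append("ID 7 is normally reserved for the adapter")
    if not (devices[0][2] and devices[-1][2]):
        problems.append("both physical ends of the chain must be terminated")
    return problems or ["chain looks OK"]

chain = [("adapter", 7, True), ("disk0", 0, False), ("disk1", 1, True)]
print(check_chain(chain))  # ['chain looks OK']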

Some older cards search the hard drives counting down in SCSI ID order.
Thus, in a configuration with a hard drive at ID 0 and one at ID 4, the system
would be trying to boot to ID 4 first. This can be, as you might imagine, very
confusing. Future Domain, a SCSI card company that was purchased by
Adaptec, operated this way. Watch out for this unusual behavior!

You'll need to be cautious of SCSI I, SCSI II, and SCSI III relative to the
cabling you'll have to do both internally and externally. If you've got a SCSI III
adapter in the computer but the hard drive you're trying to connect to is SCSI I,
then you'll need a cable that either has an adapter on one side or is SCSI I-to-
SCSI III in design. This cable rule holds true for external devices connecting to
the external SCSI port as well. You can buy cables that are specially matched
like this, or you can simply buy an adapter. It might be a good idea to shy away
from adapters if you can, though in some circumstances you may not be able to.

Cabling and Master/Slave Relationships on ATA/IDE Drives


With IDE systems you generally have a master/slave relationship. EIDE can
support four IDE devices, so you can get into multiple-master/multiple-slave
relationships, but let's keep it simple for now. If you know that the
system is going to have one IDE hard drive and one IDE CD-ROM, it's a
simple decision to make the hard drive master and the CD-ROM slave.
You want the system to boot to the hard drive, and it won't do that if it's
the slave and the CD-ROM's the master. Conversely, if you were to put
two IDE hard drives into a computer, you'd still be faced with making one
hard drive the master, one the slave. Generally, in situations such as this,
the OS will live on the master hard drive and the second hard drive will be
used for data. Both of the above IDE configurations are quite common.
What happens if you need two IDE hard drives, an IDE CD-ROM, and an
IDE CD writer? Well, then you're stuck with buying a second IDE controller
card or opting for an EIDE scenario. Today's motherboards typically include
IDE connections right on the board. No matter how you connect your
hardware, one device will be master, one will be slave.
When setting master/slave relationships, you'll almost always have to
adjust a jumper pin on the drive itself. These are clearly labeled. Read your
drive's documentation.

Integrating the New Drives into the Storage System


Most of today's servers from tier 1 vendors come with a standard
drive cage that is designed to hold up to six drives. You can get servers that
have additional drive cages as well. The drive cage is connected by a cable
to the motherboard's on-board SCSI adapter or to a RAID array controller
card. Typically the RAID cards are purchased separately, but they're well
worth adding to your initial server configuration. You'll purchase a certain
number of drives with the server, and they will come equipped with
connectors that allow them to slide into the backplane of the drive cage. It's a
very slick setup, allowing administrators to quickly change out hard drives
by simply sliding out one drive and sliding in a new one. In some cases you
may have to be cognizant of the SCSI IDs on the drives, but some newer
systems have gotten so fancy that the ID is automatically assigned based
upon location in the drive cage. Any vacant slots for which you didn't
purchase a hard drive come with a blank so things look nice and neat.
It's key that you keep track of SCSI IDs so that if you decide to add
another SCSI device to the system, such as an external peripheral, you don't
try to use an ID that is already occupied.
Generally, in small servers that will have only one drive in them (as in a
Windows 2000 Domain Controller [DC] box), you'll want to purchase the
RAID card and two drives, then mirror the OS on the RAID card for added
fault tolerance. But I think I'd take it a step further and purchase a third drive
to live in the drive cage as a hot spare. If one of the hard drives fails, the
computer won't go down because it's on a mirror. You can simply Plug and
Play the hot spare into place and rebuild the mirror. Different manufacturers
have different methods for setting up hot spares, so read your system
documentation.

Summary

Between Chapter 1 and Chapter 2, you should have your disk subsystems
covered. If you are asked which is right for your implementation, you will have
a lot to think about, but you should be able to make an informed decision. As
far as the exam goes, make sure you are able to keep each of the ATA
specifications straight. Fortunately, the naming convention makes that
relatively easy. You should pay attention to things like when the 40/80
cable came into being, and how to choose a master or a slave device.
This chapter describes adding or changing hard drives in a system. You
essentially have two flavors of drives to consider: ATA/IDE or SCSI. There are
constant improvements and upgrades to each category, so you might wind
up changing out a SCSI I drive for a SCSI III drive, resulting in an I/O
performance enhancement. Telling the two types apart is easy: look at the end
of the drive and count the pins. ATA/IDE is a 40-pin setup; SCSI varies from
50 to 68 pins depending on the type of SCSI. You'll need to be aware of
cabling issues with SCSI; some installations may require a SCSI I-to-SCSI III
cable, for example.
You'll also have to keep in mind termination issues with SCSI. Generally
the SCSI adapter (ID 7) will be terminated, and you may have to terminate the
other end of the chain as well. External devices use an external terminator,
while internal devices use jumpers for termination.
Modern servers use drive cages and ready-made slide-in devices that allow
for easy removal and upgrade of drives. These slide-in devices are proprietary,
so if you've got a slider for a Compaq computer it probably won't work in a
Dell, and vice versa.

Exam Essentials
Know that ATA and IDE are synonymous ATA is the official, standards-
defined term for IDE devices, though you will usually hear these devices
referred to as just IDE.
Know that IDE devices have the controller integrated into the drive
Unlike SCSI devices, which use an actual controller, the IDE controlling
device is contained as part of the drive. That is why it is referred to as
integrated.
Know the characteristics of each type of ATA device ATA was the first
type of IDE device, and was very limited in speed, addressable hard drive
size, and number of devices on the IDE chain. ATA-2 used DMA channels
and PIO to increase speed, and it also introduced moving information
using a block transfer mode. ATA-3 does not increase speed, but it does
introduce security measures and SMART. ATA 33 increased speed and
used DMA and bus mastering. (ATA 33 is also called UDMA 33.) ATA
66 increases speed, uses bus mastering, and also changes the ATA cabling;
it is backward compatible with previous ATA devices. To take full
advantage of ATA 66 devices, you need an appropriate cable and
controller. (ATA 66 is also called UDMA 66.)
Know the characteristics of Cable Select Cable select will determine
which drive is the master and which drive is the slave. It will assign the
master the designation C: and the slave the designation D:.
Know why jumper settings are used in ATA devices Jumpers are used
to define devices as either masters or slaves. Therefore, there can be three
settings: Master/No Slave Present, Master/Slave Present, or Slave/Master
Present.
Know how DMA works DMA is direct memory access. Instead of
requiring information to pass through the processor, the device can write
the information directly to memory. DMA predates bus mastering.
Know what bus mastering is and how it works Bus mastering is a more
efficient use of data flow. It is a capability, implemented either in the
microprocessor or in the I/O controller, that pushes information directly onto
the computer bus input/output paths. Once this is in place, data can move
directly between the I/O device and memory without tying up the processor.
Be able to install hard drives Know and understand the differences in
hard drives, cabling, termination, SCSI IDs, and IDE master/slave relationships and be able to explain how to add or upgrade hard drives in a
computer.

Key Terms
Before you take the exam, be certain you are familiar with the following terms:
AT Attachment (ATA)
ATA Packet Interface (ATAPI)
ATA 100
ATA 2
ATA 3
block transfer mode
cable select (CSEL)
Cyclic Redundancy Check (CRC)
Direct Memory Access (DMA)
electromagnetic interference (EMI)
Enhanced IDE (EIDE)
Fast-ATA
Frame Check Sequence (FCS)
Identify Drive
Integrated Drive Electronics (IDE)
jumper
Logical Block Addressing (LBA)
master
Mean Time Between Failure (MTBF)
paddleboards
Parallel ATA
Programmed Input/Output (PIO)
Self-Monitoring Analysis and Report Technology (SMART)
Serial ATA
slave
Ultra DMA 33
Review Questions
1. What does IDE stand for?
A. Integrated Device Electronics
B. Integrated DMA Efficiency
C. Integral Drive Economics
D. Integrated Drive Electronics
E. Interior Device Efficiency
2. What is ATA?
A. The name of an airline.
B. The actual standard that defines IDE.
C. IDE is the standard that defines ATA.
D. A type of burst mode DMA data transfer.
3. How does PIO work?
A. With PIO the input/output (I/O) goes directly to the processor.
B. With PIO the I/O is sent directly to memory.
C. With PIO the I/O bypasses the memory and the processor.
D. With PIO the I/O is sent simultaneously to the processor and to the

memory.
4. With DMA, where does the I/O go?
A. Directly to the processor.
B. Directly to memory.
C. The I/O bypasses the memory and the processor.
D. The I/O is sent simultaneously to the processor and to the memory.

5. Which is the ATA specification where DMA is first used?


A. ATA
B. ATA 2
C. ATA 33
D. ATA 66
E. ATA 100
6. How many pins and conductors are in an ATA cable?
A. 20 pins 20 conductors
B. 20 pins 40 conductors
C. 40 pins 40 conductors
D. 40 pins 80 conductors
E. 80 pins 80 conductors
7. How many pins and conductors are in an ATA 66 cable?
A. 20 pins 20 conductors
B. 20 pins 40 conductors
C. 40 pins 40 conductors
D. 40 pins 80 conductors
E. 80 pins 80 conductors
8. In an ATA 66 cable, what are the extra 40 conductors used for?
A. Expansion
B. +5V
C. +.5 amps
D. Ground
E. Polarity
F. DMA

9. Monica is a network administrator who has been assigned the task of

upgrading the hard drives in an older server that has been running for two
years now. There is little documentation available for this server. What is
the first thing Monica must determine before she can go forward with her
hard drive replacement?
A. How many drives are in the system
B. The type of hard drives
C. The SCSI IDs of all the hard drives
D. The master/slave relationship
E. What type of hard drives are in the computer
10. What is one of the reasons the ATA specifications are getting away

from the 40-pin, 80-connector cable?


A. The cable becomes brittle after several years and connectors break.
B. As data transfer rates increase, the signal inside the cable speeds

up, causing the electrons to go faster. This can cause the cable to
overheat, causing fires.
C. The cable can become twisted.
D. The cable can block airflow from the fan, causing excess heat

inside the case.


E. The specifications will stay the same.
11. How many devices can be in an ATA 66 chain?
A. 5
B. 4
C. 3
D. 2
E. 1

12. Which two of the following choices describe how devices that are part

of an IDE/ATA chain must be designated?


A. CD-ROM
B. Master
C. Optical drive
D. Tape drive
E. Slave
F. Target 1/Lun 1
13. What was one of the issues surrounding cable select?
A. There were none; it worked great first time, every time.
B. It would always designate the master device as Drive E:.
C. It would always assign the slave device as Drive D:.
D. It just didn't work as advertised.
14. What is the usual number of sets of pins available to designate master/

slave relationships?
A. There are usually two sets of three pins.
B. There are usually three sets of two pins.
C. There are normally six sets of two pins.
D. It varies.
15. What is the name of the device that connects the two pins to create an

I/O path?
A. Pin connector
B. Rocker switch
C. Bipolar DIP switch
D. Jumper
16. You have just been given a new IDE hard drive. When you check to see

if it is configured to be a master or a slave, you notice there are no


jumpers present. What can you derive from this observation?
A. The drive is a master device with no slave present.
B. The drive is a slave device.
C. The drive is a slave device with no master present.
D. The drive is a master device with a slave present.
E. Don't assume anything; check the documentation.
17. You've recently changed out your server's IDE hard drive with a new
one, but you can't seem to get the hard drive to come up and be recognized.
There is an IDE CD-ROM in the system as well. What could be
the problem?
A. BIOS doesn't recognize the correct cylinders and heads.
B. CD-ROM is set to be master.
C. CD-ROM and hard drive are both set to be slave.
D. Termination jumper on hard drive isn't set.
18. What is the rated data throughput for an ATA 66 device?
A. 33 MBytes/second
B. 44 GBytes/second
C. 66 MBbytes/second
D. 66 KBytes/second
E. 66 GBytes/second
19. What is another name for an ATA 66 device?
A. Ultra 66
B. Supra 66
C. DMA 66
D. EMA 66

20. If there is a primary and a secondary ATA 66 controller built into a

motherboard, what is the maximum number of ATA devices that can


be put in the computer?
A. 2
B. 4
C. 6
D. 8
E. Unlimited
Answers to Review Questions


1. A, D. Actually, depending on the data you gather, in some places it is

called Integrated Device Electronics and in others it is called Integrated Drive Electronics.
2. B. AT Attachment (ATA) is the actual standard that defines IDE.
3. A. With PIO, all I/O goes through the processor.
4. B. With DMA, I/O is sent directly to memory.
5. B. DMA is first used in the ATA 2 specification.
6. C. In the early ATA specifications, the cable had 40 pins and 40
conductors.
7. D. In an ATA 66 cable there are 40 pins and 80 conductors.
8. D. The additional 40 conductors are used for grounding to prevent the

introduction of noise and electromagnetic interference.


9. A, E. Monica must first determine how many hard drives she's replacing
and what type they are, SCSI or IDE. Once she knows what type of
drive she's dealing with, she can ascertain their SCSI IDs or the master/
slave relationship. She should also ascertain the speed of the drives, if
SCSI. Most drive and schematic information is usually available on the
manufacturer's Web site.
10. D. Because of the cable width, it can block airflow from the fan, caus-

ing the system to overheat.


11. D. There can only be two devices in an ATA 66 chain; however, there

can be two channels of two devices each.


12. B, E. If you have multiple drives in an ATA Channel, one must be des-

ignated as the master and one must be designated as the slave.


13. C. The slave device was always assigned as Drive D:.
14. B. There are normally three sets of two pins.


15. D. It is called a jumper.
16. E. In 99.9% of the cases, the drive will be a master device with no

slave present. Just to be sure, check the documentation. It will save


you the hassle of taking the drive back out and re-jumpering it.
17. A, C. Typically, new IDE hard drives are set for slave, not master. This
is something you'll routinely want to check when you purchase new
IDE hard drives. Also, on a computer with an older BIOS, you may
have to key in the number of heads and cylinders that the hard drive
came with so that the BIOS can recognize the drive. Since you're
replacing the hard drive, chances are remote that the CD-ROM was
set for master. You don't have termination worries with IDE; that's
a SCSI thing.
18. C. The rated throughput of an ATA 66 device is 66 MBytes/second.
19. A. ATA 66 can also be referred to as Ultra 66.
20. B. You can have two devices on each controller for a maximum of

four devices.

Chapter 3

CPUs and Fibre Channel


SERVER+ EXAM OBJECTIVES COVERED IN
THIS CHAPTER:
3.2 Add Processors.


On single processor upgrade, verify compatibility.

Verify N+1 stepping.

Verify speed and cache matching.

Perform BIOS upgrade.

Perform OS upgrade to support multiprocessors.

Perform upgrade checklist, including: locate/obtain latest
test drivers, OS updates, software, etc.; review FAQs,
instructions, facts, and issues; test and pilot; schedule
downtime; implement ESD best practices; confirm that
upgrade has been recognized; review and baseline;
document upgrade.

In the last chapter, we spent a lot of time talking about how to link
physical hard disks together to give you more disk space and a sense of
redundancy in case of a failure. Now we are going to move from the disk
subsystem to the brains of the operation, the CPU, and the ways that we can
maximize its effectiveness.
As you look over the objectives, you will see a lot of attention paid to
grouping CPUs together, either as part of the same physical computer with
multiprocessing or by taking advantage of groups of servers by clustering.
Clustering is one of those buzzwords that just won't go away. It takes the
concepts of RAID, mirroring, and duplexing to a new height. Basically, we are
moving the single point of failure back from the disk subsystem, back even
beyond the server. With cluster servers, instead of having our data and
applications protected by an additional disk subsystem, we are providing
high availability of data and applications by having additional servers.

For complete coverage of objective 3.2, please also see Chapters 6, 8, and 9.

Clustering

Although clustering and cluster servers are the current buzzwords, the
concepts have been around for years. Actually, the implementations have been
around for years. The mainframe, big iron people had clustering almost since
day one, and on the LAN side, Novell had System Fault Tolerance systems
back in the days of NetWare 3. So, we are not talking about new technology.
When you start talking about clustering, you are actually opening up the
discussion of disaster recovery. Now, if you have ever participated in a disaster recovery exercise, you know that it can get to be pretty intense. When you
are planning for disaster recovery, the first thing you have to do is determine
how valuable your company's data is, and how long you can live without it.
Most members of senior management will tell you that the data is invaluable
and you cannot live without it even for a second. Then you start showing the
person how you can, in fact, ensure that data is always available with
99.9999999% uptime, 24 hours a day, 7 days a week, 365 days a year. It is
an impressive display, until you get to the cost. That is when the rubber hits
the road.
So, what is clustering anyway, how does it work, and why is there the
potential for costs to skyrocket?

Clustering Basics
Clustering is basically having redundant, mirrored servers. In other words, if
one of the servers in your network were to fail, the other server, its mirror,
would immediately pick up the slack and make all of the up-to-the-minute
data available to your users, as well as all the applications that were
running on the failed server. In addition, for this to work really well, the
changeover should be transparent to the end user. In other words, Ursula
User would have no idea whether her requests for data and applications were
coming from Server A or Server B. Nor would she care.
So, now it comes time to determine what a disaster is and how we can
protect against it using clustering, because after all, there are several different
kinds of disaster. Well, the first and most obvious example of a disaster is to
have something happen to the file server; let's say that someone was walking
through the computer room with a can of soda and tripped, spilling the soda
into the file server.

Now, okay, so this scenario may not be one that immediately jumps to mind.
But I have seen a file server that handled all services for a small law firm
physically located in the break room, in a small enclosure directly under
the coffee pot. Now, of all the places that I have seen servers placed, this was
the second most bizarre. The most bizarre was at a company that wanted to
prove to their customers how technologically advanced they were. To make
sure their customers could see the server screensaver when they walked into
the waiting room, that is where the server was placed. Now, if that were not
bad enough, the keyboard was attached and was active, as was the mouse.
So, anyone who came into the reception area and was really bored could
amuse herself by starting and stopping services or just rebooting the server.
We are not even going to mention the data that was available.
Once the soda hits the file server, the smell of burnt silicon starts to
permeate the building and that server is officially designated as toast. If this
were the disaster we were protecting against, our clustered server could be
mere feet away and still provide protection. In this case, just having a
clustered server in the next room would be all the protection you would need.
Let's say there was a more serious problem. Suppose there was a fire in the
building that housed the file server. Now you can see that the only way
clustering would work would be if the machine were physically located in a
different building, but the building could still be close by. If we moved the
disaster up in scale from impacting a single building to something like a flood,
tornado, or earthquake, now the clustered servers need to be several (or many)
miles apart to be safe. We can even take this a step further: Suppose you live
in a part of the world where political unrest is a way of life, or war is
commonplace. In that case, you may want to have one of your clustered
machines located on the other side of the globe.
Clustering is just making sure that the mission-critical business applications
and data that your enterprise requires to operate have high availability,
meaning that they are available 24 hours a day, 7 days a week, 52 weeks a
year, year in and year out. This high availability is usually necessary simply
because of the cost of operations. Let's say that you are talking about the
application that runs reservations for a major international airline. If that
application is unavailable for any reason, for any time, anywhere in the
world, the loss of revenue to the company could be in the
millions-of-dollars-an-hour range.
Lets look at another example. Recently I read an article about the IS
department at the National Aeronautics and Space Administration (NASA).
It is responsible, among other things, for the computer network that tracks
the space shuttles when they are in orbit. This involves things like communication, tracking, navigation, life support, small things like that. Can you
imagine what the availability of that system must be every time a shuttle
takes off? I would imagine that having the space shuttle just take another
orbit while we reboot the server is not necessarily an option.

Clustering Technologies
Clustering offers differing challenges as you face each of the scenarios
described above. Clustering, obviously, is a combination of hardware and
software solutions to the high availability challenge. This challenge may be
something like making sure that a database application is available no matter
what the circumstances, or just making sure that a vital network service like
e-mail is not affected if one of the servers on the network should fail.
Basically, a clustered environment would look something like Figure 3.1.
FIGURE 3.1


Cluster server environment


Active database server

Networked workstations
connected to the
database server

With cluster server, you have at least two servers working in tandem. If
something were to happen to either of the servers, the other server would be
able to take over immediately. This means that if any of the server
applications were to fail, the cluster server software would restart any
configured applications on any of the remaining servers. This seems to imply
that each of the cluster servers has to be configured exactly the same way,
and that is not necessarily the case. In some implementations you may have
two different applications running on the two servers, and if one server were
to fail, the other server would start the failed application and make it
available. Look at Figure 3.2, which uses a database application and an
e-mail application as an example. This is the way the cluster would look
before the fail over.
FIGURE 3.2

Cluster server configuration before fail over


Active database server

Networked workstations
connected to cluster

Active e-mail server

Figure 3.3 shows the way the cluster would respond after a system failure.
FIGURE 3.3

Cluster server configuration after the fail over


Active database server

Access to database
and e-mail maintained

Active e-mail server
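To make the fail over in Figures 3.2 and 3.3 concrete, here is a minimal
sketch of the heartbeat idea behind it: the standby node watches for the
primary's heartbeat and starts the failed applications itself when the
heartbeat stops. This is a toy illustration of the concept, not how any
particular cluster product is implemented.

# Toy standby node: if the primary's heartbeat goes silent for too
# long, start the primary's applications locally (the fail over).

import time

HEARTBEAT_TIMEOUT = 5.0   # seconds of silence before we fail over

class StandbyNode:
    def __init__(self):
        self.last_heartbeat = time.time()
        self.running_apps = {"e-mail"}      # this node's own workload

    def heartbeat_received(self):
        self.last_heartbeat = time.time()

    def check_primary(self, primary_apps):
        """Call periodically; start the primary's apps if it goes silent."""
        if time.time() - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            for app in primary_apps:
                if app not in self.running_apps:
                    self.running_apps.add(app)
                    print("fail over: starting", app, "on the standby node")

node = StandbyNode()
node.last_heartbeat -= 10.0         # simulate a primary that went silent
node.check_primary({"database"})    # fail over: starting database ...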

But providing access to applications is only half the problem. What about
providing access to the data that can be changed and updated on a
minute-by-minute basis? Going back to our example of the reservation system
for an international airline, the application is not much use if the database for
flight and passenger information is not available. Therefore each server node
in the cluster must have access to the same information so that the application
and the data can be moved from one location to another without any
downtime. One of the ways this can be done is with shared external storage.
Take a look at Figure 3.4.
FIGURE 3.4

Shared external storage maintained with a SCSI bus


Cluster
server
nodes

Shared
SCSI bus

SCSI disks

In this case, both servers are accessing the same storage location, linked
together with a SCSI bus or a Fibre Channel configuration. It is up to the
cluster server software to decide which node has access to which pieces of
data at any given time. In this configuration, only one node can access any
information at any time. This is one way of making sure that the data on the
external storage is not corrupted.
The disadvantage of this configuration is that due to the limitations of
SCSI, the machines must be located close together. As you can see in Figure
3.5, shared SCSI technology has a distance limitation of just 82 feet.
FIGURE 3.5

Cluster server with SCSI limitations

Cluster
nodes

Shared
SCSI bus

Shared SCSI distance


limitation of 82 feet

SCSI disks

Even Fibre Channel technology is limited to several kilometers.
That would work well in our first scenario of someone spilling a soda on a
file server, but may not provide much protection against fire, flood, or
earthquake. In this case, if someone in London called in looking for a plane
reservation, the caller may have little or no sympathy that a backhoe in
Minneapolis cut a cable that provided data to the airline reservation system.
They would probably ask, and perhaps rightly so, why wasn't something like
this anticipated? Why should a relatively minor mistake in one part of the
world adversely affect computing in another part of the world?
So the challenge then becomes to protect not only the servers and their
applications, but also the data, and not in just a single location, but perhaps
worldwide. If you use clustered servers with RAID storage, with machines
with high-level uninterruptible power supplies, and even have emergency
power available, you appear to be taking precautions against most disasters,
but there are still those disasters that require more redundancy than even a
set of clustered servers can provide.

There are solutions available that, for example, can use an IP network to
bypass the limitations of Fibre Channel or shared SCSI. When it comes time
to manage the data, it is handled like this: When there are changes to the data
on the primary node, these changes are captured and are sent via TCP/IP to
the backup node. That way, there is an exact copy of the data stored on the
second disk. If for any reason the primary data storage area should become
unavailable, the data is still accessible. In some cases, the solution can
actually create multiple copies of the data, so even the backup is being backed
up. In this way, if there were a problem with the home site in Minneapolis,
users in different areas of the world would not suffer. Configured applications
would be back online within minutes and the data would be up-to-the-minute.
This would save tons of time over solutions like tape backups, where
the data is, at best, hours old, and at worst, days or weeks old.
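Here is a minimal sketch of that capture-and-ship idea. It is only an
illustration of the concept: a simple in-memory queue stands in for the
TCP/IP link, and the block data is made up.

# Writes on the primary are captured and shipped to the backup, which
# applies them so it always holds an up-to-the-minute copy.

from collections import deque

network_link = deque()   # stands in for the TCP/IP connection

class PrimaryNode:
    def __init__(self):
        self.storage = {}

    def write(self, block, data):
        self.storage[block] = data
        network_link.append((block, data))   # capture and ship the change

class BackupNode:
    def __init__(self):
        self.storage = {}

    def apply_pending(self):
        while network_link:
            block, data = network_link.popleft()
            self.storage[block] = data       # mirror the change

primary, backup = PrimaryNode(), BackupNode()
primary.write(0, b"passenger record")
primary.write(1, b"flight manifest")
backup.apply_pending()
assert backup.storage == primary.storage     # backup is up to the minute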

Clustering Scalability
When it comes to scaling the clustering solution, you get what you pay for.
In some cases, you may only cluster on a one-to-one basis, so there is little
flexibility. With other solutions, you can configure the cluster to provide a
variety of solutions. Take a look at Figure 3.6.
FIGURE 3.6

Two-server basic cluster


Primary data server

Backup data server

TCP/IP network

In this case, we have the most basic clustering solution, where one server
is acting as the primary and the other is acting as the backup. This is the
prime definition of clustering. Any data that is written to the primary is
written to the backup. If something were to happen to the primary, the fail
over would bring the backup online and life would go on with up-to-the-minute
data. In this case, there is one primary server and one backup server.
(I know the term Fibre Channel is new and we have not talked about it yet,
but we will, later in this chapter. Right now it is just important to realize that
it can be used to link servers with storage subsystems and it has a longer
distance limitation than SCSI. We will cover the rest of the stuff later!)

That is not necessarily the way it has to work. Using some software implementations, you can configure clustering so there are two primary servers and the data
replication is two-way, as shown in Figure 3.7. Now this configuration does have
a gotcha. In this case, the data has to be independent. Any data that originates on
one server can only be changed on that server. If it is changed on the backup server,
the changes will not be replicated back to the original server.
FIGURE 3.7

Two-way replication with both machines being the primary servers

One machine may be a dedicated


backup or both machines may
provide disaster recovery for the other
WAN

Now, there are other, more creative ways that you can use clustering
solutions. For example, you can do what is called daisy-chaining clustered
servers. In this case, let's say that we had some critical data in the office in the
Florida Keys. If the primary server went down, we wanted a rapid fail over,
so users could quickly pick up where they left off. That solution would
require a backup server on site, so we would not have to fight wide area
network bottlenecks.
Because this data is critical, and because we also understand that the Keys
are subject to hurricanes and other natural disasters that could render the
two-servers-in-the-same-location solution worthless, we need to make
another backup copy off-site, somewhere far away. In this case, we can
daisy-chain the servers, so there are two servers in the Keys, and another
off-site, away from potential storms and other disasters.

Clustering Summary
Clustering is a viable solution, but the level of protection that you get
depends on the level of expenditure that you make. Some clustering solutions
that are right out of the box can handle only a one-to-one server relationship,
and even then, the servers have to be in close proximity. If you want true
disaster recovery capability where the servers are located hundreds of miles
apart, you are probably going to have to go with a specialty solution.

Real World Scenario


When most people think of clustering, they think of Microsoft's clustering
solution. That is just one of the ways of accomplishing the task. There are
a variety of third-party vendors that offer clustering solutions for different
operating systems.
Clustering is called for when high availability is a way of life. If you have
servers on your network that absolutely, positively must be available or the
entire company might be at risk, clustering is the way to go. Make sure
when you do decide to cluster servers that you pay close attention to server
placement. Again, the purpose of clustering is high availability. If you place
both servers in close proximity to one another, you may be defeating the
purpose.
When you look at server location, be sure to look at the types of links that
are available between the locations. Clustering can be bandwidth intensive.
If you are going to be moving a lot of data between locations, make sure
that you have the infrastructure to support the demand. Also, you should
plan for the built-in latency of such a configuration. If you are crossing miles
of cable, obviously things are not going to be happening in real time. You
will need to build in solutions for this eventuality.
One of the ways that you can extend the area you cover is with Fibre
Channel. We will be looking at that next.

Fibre Channel

Now, one of the suggested ways to link things together is with
Fibre Channel. Let's take a look and see how that works, and what kinds of
things you can hook together.

History of Fibre Channel


The way things usually seem to work in the computer industry is that
someone develops a technology, and then after the new technology has been
around for a while, some group comes along and writes the standards for
that technology. If you remember back to Chapter 1, that whole process was
mentioned several times during our discussion of SCSI. Fibre Channel is a
little different. The standards were being developed long before the products
hit the market. As a matter of fact, the American National Standards
Institute (ANSI) started working on the Fibre Channel standards back in
1989. The whole point was to establish a high-speed connectivity
standard. Originally, as the name implies, this connectivity was to take place
over fiber optic lines. Now, if you look at data coming out of the Fibre
Channel Industry Association (www.fibrechannel.org), they are quick to
point out that the channel can be either optical or copper.
One of the challenges that has faced networking from the very beginning
was finding a way to move data quickly over long distances. It seems like
there has always been a need for high-speed communication to link
host-to-storage or server-to-server. The applications, the processors, the
storage, even the workstations, seemed to get faster and faster, but the ability
to get the data from point A to point B didn't seem to keep up. Fibre Channel
is a way of addressing these needs. It is a gigabit interconnect technology that
allows multiple communications between workstations, mainframes, servers,
data storage, and other devices, all using either IP or SCSI technology. There
are ways to interconnect systems with multiple technologies to move data at
speeds that approach a terabit per second.

Fibre Channel Terms


Yeah, I know, these will be defined later in the key terms, but you had better
have a good grasp of them now, or the rest of this discussion probably won't
make much sense. These are some terms that are used a lot when talking
about Fibre Channel communication.
The first two to look at are channel and network. It is important to make the
distinction. The channel is the point-to-point communication between two
devices. The point-to-point communication can be either a direct connection,
like between a server and an external disk subsystem, or it can be switched,
like a server-to-server connection using a switch. Channels are hardware
intensive and support the transportation of data at high speeds with low
overhead. A network, on the other hand, is the grouping of nodes that can
communicate based on a common protocol. So, when the user Brandice
requests information from the disk subsystem, the request is carried over the
network to the server, and the server accesses the information off of an
external set of disks using the point-to-point channel.
Next, there is the term Fabric. The Fabric is the active, intelligent
interconnection scheme, or basically how these things connect. When you
think of Fabric, think about the part of the infrastructure that controls the
routing. Ports on the Fabric are called F_ports. You will also see the term
FL_port, which is just an F_port in an arbitrated loop environment.
The data flows between hardware entities called N_ports. The N_port is
usually a termination card that contains the hardware and software necessary
to deal with the Fibre Channel protocol. There must be at least one N_port in
each node, though usually there are two. You will also see the term NL_port,
which is just an N_port in an arbitrated loop environment.
The N_port has a unique address, called the N_port Identifier, and it also
contains a Link Control Facility (LCF).
If you can keep these terms straight, the rest of this section will be much
easier.


Fibre Channel is not really one standard; it is a set of standards. Take a
look at the following nested list of Fibre Channel standards.
Fibre Channel Standards
ANSI X3T11 Fibre Channel Standards and Draft Standards
   Fibre Channel Physical (FC-PH): Fibre Channel Physical and Signaling Interface
   Fibre Channel Reference Card
   Fibre Channel-PH-2
   Fibre Channel-PH-3
   Fibre Channel Arbitrated Loop (FC-AL)
   Fibre Channel Protocol for SCSI (FCP)
   Fibre Channel Protocol for 802.2 Link Encapsulation (FC-LE)
   Fibre Channel Protocol for High Performance Parallel Interface (HIPPI) (FC-FP)
   Fibre Channel Protocol for Single-Byte Channel Command Set CONnection Architecture (SBCON)
   Fibre Channel Generic Services (FC-GS)
   Fibre Channel Enhanced Physical Interface (FC-EP)
Profiles
   FCSI Profiles
      FCSI Profile Structure (FCSI-001)
      FCSI Common FC-PH Feature Sets (FCSI-101)
      FCSI SCSI Profile (FCSI-201)
      FCSI IP Profile (FCSI-202)
      Gigabit Link Module Specifications (FCSI-301)
   Loop Profiles
      Private Loop Direct Attach Document
      Public Loop Profile (FC-PLP)


Other Fibre Channel Specifications and Documents
   N_port-to-F_port Interoperability
   10-Bit Interface Specification
   Fibre Channel Management Information Bases (FC-MIBs)
   Fibre Channel Optical Converter Proposed Specification

Fibre Channel Basics


Fibre Channel is a system where the designated ports log in to each other
through the Fabric. Given this design, the Fabric can be a circuit switch, it
can be an intelligent active hub, or it can be a loop. With Fibre Channel there
are three different topologies.

Point-to-Point
You remember point-to-point from back in the Network+ class, don't you? This is the simplest of all topologies. With a point-to-point connection, there is a bidirectional link that connects the N_ports on two nodes. A point-to-point topology will usually underutilize the bandwidth of the communications link.

Arbitrated Loop
With arbitrated loop, we start looking at a form of Fabric topology.
- An arbitrated loop has shared bandwidth, meaning there are more than two nodes connected.
- It interconnects the NL_ports/FL_ports using unidirectional links.
- There is only one active L_port-to-L_port connection, so there is the opportunity for blocking.
- In order to guarantee access to the loop, there is a fairness algorithm in place that ensures that no L_port is blocked from accessing the loop.
- If any link in the loop should fail, the communication between all the L_ports is terminated.
- With an arbitrated loop, there is no switched Fabric, so it is less expensive than Cross-point.


Cross-point or Fabric Switched
With a Cross-point or Fabric Switched network, you get higher performance and connectivity.
- There is a bidirectional connection between a node (N_port) and the Fabric (F_port).
- If multiple paths are configured between any two F_ports, it can be configured as non-blocking.
- This method has increased overhead, with a destination identifier added to the frame header. This helps to route the frame through the Fabric and get it to the desired N_port.
- This topology provides the highest grade of performance and connectivity.
- It efficiently shares the bandwidth.
- Connectivity is guaranteed and there is no congestion.
- If stations are added to the Fabric, it does not reduce the point-to-point channel bandwidth.
- Generic Fabric requirements are defined by standard (FC-FG).
- Switch Fabric characteristics are defined by standard (FC-SW).

The topology of the Fibre Channel is completely transparent to the attached nodes. The ports will all negotiate the type of connection, regardless of topology. There are some caveats, though. For example, a point-to-point or an arbitrated loop topology requires that all nodes use the same data transfer rate at all times. With Cross-point, there can be a dynamic rate conversion. With a switched topology, a 266Mbaud unit could connect to a 1.062Gbaud unit. In this case, the ports would negotiate the highest shared data rate for information transmission.
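To picture what that negotiation amounts to, here is a minimal sketch in Python. The rate lists are hypothetical, and the real negotiation happens in the port hardware during login; the point is simply that two ports settle on the fastest rate they both support.

def negotiate_rate(rates_a, rates_b):
    """Return the highest data rate (in Mbaud) supported by both ports."""
    shared = set(rates_a) & set(rates_b)
    if not shared:
        raise ValueError("no common data rate; the link cannot come up")
    return max(shared)

# A 266Mbaud-capable port connecting through a switch to a multi-rate
# 1,062Mbaud port settles on 266Mbaud, the fastest rate both support.
print(negotiate_rate([133, 266], [266, 531, 1062]))   # prints 266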

Using Fibre Channel


So far we have established that Fibre Channel is fast and that it has exceeded
the normal SCSI distance limits. Where may you see Fibre Channel in the
future?
One of the technologies that is starting to get more and more play is the Storage Area Network, where all the servers use one common storage area for application and data storage. Obviously, in this case the storage would


have to be very high performance, which is a perfect implementation for Fibre Channel. You will also see it in areas like these:
- Very large databases or data warehouses
- Backup subsystems and disaster recovery implementations like cluster servers
- High-performance workgroups, like CAD/CAM implementations
- Backbones for small, campus-wide networks
- Networks that make use of digital audio and video

Storage Devices and Storage Area Networks


Fibre Channel is being provided now as a standard disk interface. As a matter of fact, controller manufacturers like Adaptec have several different
RAID controllers available that make use of Fibre Channel to the external
subsystem. When these RAID arrays are configured in massive sizes and
shared among servers, they become a Storage Area Network (SAN). To see
what I mean, check out Figure 3.8.
FIGURE 3.8   Diagram of a SAN (servers, a switch or hub, Fibre Channel RAID arrays, a SCSI bridge with SCSI RAID, and an external SCSI disk)


Now the interesting thing about SANs is that both SCSI and IP protocols are used to access the storage subsystems. The servers and the workstations all use the Fibre Channel network to get access to the same sets of storage devices or systems. If there are older SCSI devices on the network, they can be integrated into the Fibre Channel network through the use of the SCSI bridge. What kind of performance are we talking about? Well, using a gigabit link, bandwidth is reported to be in the neighborhood of 97 MBytes/second for large file transfers.

Fibre Channel as a Backbone
Besides being used as a link to disk arrays or to cluster servers, Fibre Channel can also be used as a backbone for campus networks. There are several reasons why it works well in this implementation:
- Confirmed delivery of packets
- Support for network resolution protocols like ARP, RARP, and others
- Support for point-to-point configurations, shared bandwidth loop circuits, and scalable bandwidth switched circuits
- Use of true connection services or services using fractional bandwidth; these can be connection-oriented virtual circuits or real circuits
- Circuit setup time measured in microseconds
- Low-latency connection-oriented or connectionless service
- Full support for time-synchronous applications like video
- Variable frame sizes

Central Processing Units (CPUs)

The CPU is the brains of the server. It is responsible for the control and direction of all the activities that the server participates in, using both the internal and external buses. The CPU is just a processor chip that consists of millions of transistors. That is what a CPU is and does. But like most things in computing, there are dozens of processors.
When it comes to CPUs, there are only a few well-known manufacturers. The best known, and the two manufacturers that are constantly battling it out for the title of fastest, are Intel and Advanced Micro Devices (AMD). Of the two, Intel is probably the more widely recognized, although AMD is making inroads every day in the desktop and mobile computing market.

Other manufacturers of server-quality processors include Motorola and IBM, which have combined forces to create Reduced Instruction Set Computing (RISC) chips. So, because we are talking about servers, we will limit our discussions to Intel and RISC.

Back in 1965, Gordon Moore, who would later co-found Intel, was preparing for a speech when he made a remarkable discovery: the number of transistors per square inch on integrated circuits had doubled every 12 to 18 months since the integrated circuit was first invented. Moore speculated that the trend would continue, and that became Moore's Law. If you examine the prediction, you will find that it has been remarkably accurate. In recent years the trend has slowed, and Moore has revamped the law to state that the density of data will double every 18 months. Why are we mentioning Moore's Law? Simple: everything you are about to read about processors is outdated. Most of it was probably outdated in the time it took this chapter to go from my desk, through the editorial process, to the printer. So, if you are reading this and thinking, "What the heck is he talking about? Gigahertz is not the state of the art," remember that when this was being written, the gigahertz barrier had just been broken.
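To get a feel for how aggressive that doubling is, here is a quick back-of-the-envelope calculation in Python; the starting transistor count is purely illustrative, not an actual Intel figure.

def moores_law(start_count, years, months_per_doubling=18):
    # Density doubles once per doubling period.
    doublings = (years * 12) / months_per_doubling
    return start_count * 2 ** doublings

# Ten years of 18-month doublings turns a hypothetical 10,000-transistor
# chip into one with roughly a million transistors.
print(round(moores_law(10_000, 10)))   # about 1,015,937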

Intel Processors
At the time of this writing, the primary Intel processors on the server market were the Pentium III and the Xeon Pentium III. Intel was also about to release its first foray into the IA-64 architecture, called the Itanium processor. Since the Itanium was scheduled to be the latest, greatest, bestest, fastest server processor on the block, it was billed as the perfect solution for the large server market, even before it was released. That left the Xeon and the Pentium III to hold down the fort on the mid-range and low-end server market.

Intel also has the Celeron processor on shelves, but it was designed for the lower-end, desktop market. Because it is designed for the desktop, we won't look at it here.


Xeon Pentium III


At the time of this writing, the Xeon was being sold in speeds of up to 1GHz. The Xeon came in versions that supported two-way processing or multiprocessing. Two-way processing means having a motherboard with dual Xeon processors, while multiprocessing can support two or more processors. Other features of the Xeon include these:
- Backward compatible with applications that were written for earlier processors.
- Optimized for 32-bit applications running on 32-bit operating systems.
- Utilizes the Dynamic Independent Bus Architecture, which separates the dedicated external 100MHz system bus from the dedicated internal cache bus, which operates at the full processor core speed.

Internal cache memory is a storage area that is designed to hold frequently used data and instructions. The processor contains an internal cache controller that integrates the cache with the CPU. The cache holds copies of the data, and the controller keeps track of the RAM locations that data came from, providing faster execution of data and instruction sets. Because it is always faster to get information out of cache than from main memory or disk, the more cache there is, the faster the processor seems.

- The Xeon can use 1MB or 2MB of unified, non-blocking, level-two cache.
- 100MHz system bus speed.
- Can access 64GB of physical memory.
- In the multiprocessing configuration, the Xeon can support four-way symmetrical multiprocessing without specialized chipsets and clustering. With specialized chipsets, it can scale to eight-way configurations.
- Requires a mainboard specifically designed with the Xeon chipset.
- More expensive than the Pentium III.

For a more in-depth look at cache and at memory in general, read Chapter 4,
Memory.


Figure 3.9 shows a picture of the Xeon Pentium III processor; the photo was taken from Intel's pressroom at http://www.intel.com/pressroom/archive/photo/processors.htm.
FIGURE 3.9   Xeon Pentium III processor

Pentium III Processor


The Pentium III is an extension of the older Pentium II processors, which means that a Pentium III would slip nicely into a mainboard designed for a Pentium II. The only upgrade that would be necessary would be to upgrade the BIOS on the mainboard to support Streaming Single-Instruction, Multiple-Data Extensions (also referred to as SSE or Streaming SIMD). These instructions were designed primarily for 3D graphics, popular in games.
When the Pentium III first came out, there was a bit of a controversy surrounding the chip. With the P-III, Intel promoted the concept of ID numbering and a random number generator that was based on the fluctuations of heat within the processor. The plan was to make for more secure Internet transactions by turning the ID numbering off or on using software. The instructions were really designed to make sure that the processors were not being reworked so they could be overclocked. People concerned with privacy worried that Internet sites would require users to keep their ID turned on in order to access the site, thus tracking a user's movements over the Internet. The first set of P-IIIs had ID tracking turned on by default.
The Pentium III has clock speeds of up to 1.13GHz, and it can use a 133MHz system bus.
Figure 3.10 shows a picture of a Pentium III processor.

FIGURE 3.10   Pentium III processor

The Pentium III is not as expensive as the Xeon processor, and the supporting cast of mainboard and memory will bring down the cost as well.

RISC Processors
You want power, we got power. Of course, like most things in computing, the more performance you receive, the more you pay for it.
RISC chip servers are at the high end of the server platform, usually reserved for high-availability, highly accessed Web servers. RISC-based servers can scale from a single processor up to 64 processors in the same machine. Of course, the cost is going to be considerably higher than the usual $10,000 to $15,000 price range for a starting server. In the case of a RISC server, a cost well over $100,000 is not unheard of.
RISC is usually associated with Unix implementations, although Windows NT also ran on the RISC platforms.

Advantages of RISC
The RISC processor does offer several advantages over its Complex Instruction Set Computing (CISC) counterparts:
Speed   The name says it all. With RISC, you are dealing with a reduced instruction set. That means that RISC processors often show two to four times the performance of CISC processors built with comparable technology and running at the same clock rates.
Simpler Hardware   Because the instruction set is simpler, it uses up less chip space. That means that extra functions like memory management or floating-point arithmetic units can be installed on the same chip. Also, since the chips are smaller, there can be more parts on a single silicon wafer, and that reduces the cost per chip dramatically.


Shorter design cycle   Since the chips are simpler, they don't take as long to design as their CISC brethren. That means RISC chips can respond to changes in the hardware marketplace sooner than the CISC designs, so there will be greater leaps in performance between generations.

The Risks of RISC


Many of the problems associated with switching to RISC revolve around software.
Code Quality   The performance of the RISC processor depends on the code it is executing. When the code is executing, it depends on something called instruction scheduling to determine how quickly things work. If the instructions are not scheduled correctly, the processor will spend lots of time waiting for the result of one instruction before it can proceed to another instruction. Therefore, the code written for RISC has to be written and compiled in a high-level language like C or C++. Since C and C++ are just languages, it is the quality of the compiler that makes for efficient code.
Debugging   If you have to worry about instruction scheduling, which is normally an out-of-sight, out-of-mind process, debugging becomes difficult. If scheduling is turned off, the machine language instructions show a clear connection with their lines of source code. Once scheduling is turned on, the machine language instructions for one line of code may show up in the middle of the instructions for another line of source code. This not only makes the code hard to read, it makes it hard to debug.
Code Expansion   Code expansion is where you take code that was originally designed to run on a CISC machine and recompile it to run on a RISC machine. Since the RISC machine understands fewer instructions, the same code may require more instructions to accomplish the same tasks. The exact size of the expansion depends on the quality of the compiler and the machine's instruction set.
System Design   RISC machines run fast, and that means they have to have really fast memory systems to feed them instructions. RISC-based systems typically contain a large memory cache, usually on the chip itself. This is called first-level cache.

RISC Summary
While RISC is exceptionally scalable and works tremendously well in servers that are going to be heavily utilized, the monetary costs can be considerable.


Real World Scenario


The marketing from the motherboard manufacturers and from Intel will tell you that Pentium III processors will work with motherboards designed for Pentium IIs and vice versa, and that is true. But what they may not tell you is that there are tricks involved in installing the processor.
If you buy a motherboard or a system that does not come with the processor, be sure you get all the documentation with the system. There are switches or jumpers that may need to be set to make sure things match between processor and motherboard. If they don't match, damage can occur to the two most costly components of the system. It is truly a case of RTFM, or Read The Fine Manual.
One of the things you should look for in the documentation is the motherboard's beep codes. If there is something wrong during the Power On Self Test (POST), the motherboard may not be able to display a message on the monitor, because it could be the video subsystem that is bad. Therefore, the motherboard will announce errors through a series of beeps. At one time, I worked for the tech support department of a computer manufacturing company, and one of the tests we used to run was to unplug everything from the motherboard but the processor and the power and turn it on. When the test was run, you expected to hear seven short beeps. If you didn't hear seven short beeps, there was a problem with the motherboard.
If you install a processor on a mainboard, be sure the fan is installed properly and is functioning. Heat is a major cause of problems with processors, and the first line of defense is the fan. Also, be sure the processor is properly seated in its slot. Many times, new technicians are afraid to push down hard on the processor because the motherboard may crack or break. Motherboards are more resilient than you may think.

Multiprocessing Support

For those of you who are fans of the American comedian Tim Allen, perhaps we should just re-title this section "More Power!" See, unlike Allen, I don't think it is just a guy thing. I want to make this more politically correct, because everyone at one time or another wants more power. Certainly the people on your network do, every time they complain about how slow the network is running today. One of the ways that you can give them more power is with Symmetrical Multiprocessing (SMP).

You may be asking yourself, why all this fuss about multiple processors in a single computer? After all, with processors getting faster all the time, won't that take care of the issue?

Multiprocessing Basics
This will be like the discussion of RAID, complete with a whole new set of
acronyms and strange terms. Bear with me and it will make sense. First of all,
why might you need SMP?
When you take a look at the world of the uniprocessor (UP), you realize that
the processor is actually doing a lot of work at the same time. For example, the
processor may have a fixed-point arithmetic unit and a floating-point arithmetic
unit all on the same CPU. That means that the processor can run multiple
instructions within the same CPU. The thing to keep in mind is that while several
instructions can be run in parallel, only one task can be processed at a time.
Look at Figure 3.11.
FIGURE 3.11   Uniprocessing (multiple tasks queued behind a single processor)

In this case, you have multiple tasks backed up behind a single processor. Now, you are probably saying to yourself, "Wait a minute, he just said that processors can perform multiple instructions at the same time. What is the difference?" Think of it this way. Imagine yourself drying dishes after a big meal. Each dish is a task. You may be able to dry multiple parts of the dish at the same time, but you cannot dry multiple dishes at the same time. Adding another processor to the mix, like bringing in another person to help, will cut the number of tasks down by half and speed up the process of drying the dishes. Figure 3.12 shows what I mean.
FIGURE 3.12   Multiprocessing (tasks distributed across three processors)


Now, you would think that, like bringing in another person to help dry the dishes, adding another processor would increase the overall performance of a system in a directly proportional fashion. In other words, if you added a second processor to the system, the system would be twice as fast. It would be wonderful if it worked that way, but it doesn't. You see, there are a lot of other factors that have to be taken into consideration. The problem is not just buying a motherboard that is compatible with two CPUs. All of the chipsets on the motherboard have to be able to work with more than one CPU. The CPUs themselves have to have hard-coded programming to work in parallel and, once all the hardware is in place, the operating system has got to be able to handle multiple processors. All of that has to happen just to make sure two processors can work in tandem. Can you imagine how much behind-the-scenes stuff has to go on to work with up to 64 processors? Not only that, but there is still one more piece to the puzzle, and that is the application.

Threads: the Building Blocks of Tasks

A thread to the processor is the very smallest part of a task. For example, when your alarm went off this morning, the reflex that caused you to move your hand to shut off the alarm was a thread of that task. Now, if you are looking for a more definitive explanation of a thread, it can be defined as a concurrent process that is part of a larger process or program. In a multitasking operating system, a program may contain several threads, all running at the same time, inside the same program. This means that one part of a program can be making a calculation while another part is drawing a graph or a chart, or making another calculation using another processor.
So, in order to use multiple processors, you have to have the ability to use multiple threads, and these threads can be a piece of a program, as long as that piece does not depend on other pieces. In other words, when we use threads in a multiple processor environment, each thread has to be totally focused on its job. The thread can be split off and run either in serial or in parallel to other threads without affecting its function.
There are two kinds of threads: kernel and user. The difference is what has control of the thread and what is aware of the thread. A kernel thread runs close to the base operating system, so the kernel is aware of the thread and has the actual control over it. With a user thread, on the other hand, the process that spawns the thread has control over the thread. User threads are usually faster to create, and you are able to quickly switch contexts. User threads cannot be used across multiple processors, because the kernel decides which CPU a thread runs on. If the kernel does not know about the thread, then the thread can only run on one CPU.
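To make the idea concrete, here is a minimal sketch in Python of two independent pieces of work running as separate threads. Python's threading module creates threads the kernel knows about, which is the property the paragraph above says an SMP system needs; the worker function and its inputs are made up for illustration.

import threading

# Two independent calculations; neither depends on the other's results,
# so each can be split off as its own thread.
def calculate(label, terms):
    total = sum(i * i for i in range(terms))
    print(f"{label}: sum of squares below {terms} is {total}")

# Because these are kernel-visible threads, on an SMP machine the
# operating system is free to schedule them on different processors.
t1 = threading.Thread(target=calculate, args=("thread 1", 1_000))
t2 = threading.Thread(target=calculate, args=("thread 2", 2_000))
t1.start()
t2.start()
t1.join()   # wait for both threads to finish before moving on
t2.join()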


SMP Hardware
Obviously, when you are talking SMP, the hardware is important. With some of the earlier Intel CPUs, you could mix and match older CPUs of close clock speed, so you could do things like put a Pentium 166 in with a Pentium 200. You would just have to set both CPUs to run at either 200MHz or 166MHz, which of course could affect system stability.
Things are a little different with the more recent CPUs. With the more recent systems, a multiplier is used. The CPU bus clock rate, called the Front Side Bus (FSB) rate, is multiplied by this multiplier to produce the core clock speed. So, with a Pentium III 500, the FSB would be 100MHz and the multiplier is 5, giving you the 500MHz. Intel now sets locks on the multipliers used in the CPU to control the final clock speed. Because of this, if you are running multiple CPUs with Intel, you have to make sure the clock speed and the multiplier are the same.
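The arithmetic is simple enough to show in a couple of lines of Python, using the same numbers as the example above:

# Core clock speed = Front Side Bus rate x multiplier.
fsb_mhz = 100
multiplier = 5
print(fsb_mhz * multiplier)   # 500 (MHz), the Pentium III 500 example

If either number differs between two CPUs in the same box, their core clocks will not match, which is why both the clock speed and the multiplier have to be the same.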
The CPU also has to have the onboard circuitry to work with other CPUs in
the same system. If that circuitry is not there, the CPU will simply not take
advantage of the other CPUs. This should not be a worry, because all of the Intel
CPUs that have been developed since the early Pentiums have had the ability to
work and play well in an SMP environment.

Intel does make the distinction between the dual processing (DP) environment and the multiprocessing (MP) environment for chips that are marked with a VSU. If your Pentium chip has the VSU marking, it means the chip has been validated to work in a uniprocessing and multiprocessing environment, but not in a dual processing environment. The difference is that DP is a special mode of operation for two Pentium processors where there are four dedicated private pins and there is specific DP on-chip circuitry. This circuitry allows the processors to handle the negotiation of how to use the resources and the data buses. Since there is no operating system intervention required, this is referred to in Intel literature as a "glueless" solution. An MP setup requires "glue," like the operating system, to negotiate between the processors.

There are limitations to the number of processors that can be used. For
example, some Pentium IIIs can only be used in pairs, while the Pentium III
Xeon can be used in eight-CPU configurations.

Operating Systems
Having the hardware able to recognize the multiple CPUs is one half of the battle; the other half is the operating system. The operating system has to be able to figure out that there is more than one CPU present and also load the proper kernel. The kernel must then be multithreaded to take advantage of the multiple CPUs inside the OS. This is really a bigger issue than it may sound, because many of the system calls are not thread-safe and cannot be reconfigured to work in a multithreaded environment. In that case, locks have to be put in place to serialize those system calls.
The operating system is also responsible for system stability. It has to manage the caches of the different CPUs. That management can get tricky, because it has to make sure that the contents of the caches match each other, as well as the original data, whether it is stored in RAM or on a disk. This is one of the major hurdles for running multiple CPUs.
The OS must also support all of the processors that are available in the hardware. An example would be that Windows 2000 Professional only supports two CPUs, so if you ran it on a system where there were four, two wouldn't be used. Windows 2000 Datacenter Server supports 32 CPUs out of the box. If you are using Linux, some of the Linux kernels natively support 16 CPUs, although the kernel code can be rewritten so that up to 64 processors will be recognized.

Applications and SMP


Every application that runs on an SMP system has to be written or coded very carefully to take advantage of these kernel threads whenever it can. Since each application is different, using different sequences and a different set of dependencies, the developer must figure out how to make the best use of the multiple processors. When the developer has determined those parts that can run independently of each other, these can be split into separate chunks that are then perfect candidates for separate threads. For those parts that aren't independent, there must be some form of internal locking mechanism to protect the calculations and the data they contain. Because of the locking mechanism, there may be times that the developer opts not to make use of SMP. It may be that using another processor for an independent piece would cause data corruption, meaning that it was more trouble than it was worth.

You may also see the term processor or CPU affinity. This is the process of assigning specific applications or processes to run on a specific processor. For example, you may have a quad-processor machine and want a database indexing function to run specifically on the fourth processor. This would be a function of affinity.
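If you want to see affinity in action, modern Linux systems expose it directly; os.sched_setaffinity is a real Python call on Linux, though the CPU number and the indexing scenario here are just placeholders matching the example above.

import os

# Pin the current process to CPU 3 (the fourth processor, counting
# from zero), mirroring the database-indexing example.
os.sched_setaffinity(0, {3})       # pid 0 means "this process"
print(os.sched_getaffinity(0))     # {3}

# Any CPU-hungry work run from here on stays on processor 3, leaving
# the other processors free for everything else.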


SMP and Performance


When it comes down to SMP and performance, a lot can be summed up with the phrase, "It depends." For example, if you are using two processors on a desktop machine that is running Windows 2000 Professional and you are working with an application that is not calculation intensive, like Word, you probably won't notice much of a performance increase. However, if you are using some really math-intensive applications like CAD/CAM, you may see a remarkable increase.
The same is true with servers. The server will show peaks and valleys of SMP performance gains. The server can use the multiple processors to offload its I/O processing across the CPUs. The difference is most noticeable when there is heavy server usage. Extra CPUs and better handling of I/O will make massive improvements in performance when performance is needed most.

Real World Scenario


I don't know about you, but one of the problems that I have is the "more is always better" syndrome. You know what I mean: If 256MB of RAM is good, 512MB has to be better. If one processor is good, two must be better, and four must be better still. As a matter of fact, I have a friend who is running a mail server in his office with four processors in it. He thinks it is really cool. The rest of us think it may be a tad bit of overkill to run a mail program that services fewer than 100 people. So, when it comes to processors, more being better may not necessarily be the case. Before investing in a super server with 16 processors, examine your alternatives and examine the applications that are going to run on the server. You may find that the operating system, contrary to marketing claims, may do a great job with multiprocessor support for two or four processors and may level off after that. You may increase your performance and get better results (and it may even be cheaper) if you buy four servers with four processors rather than one server with sixteen processors.
Do your homework and remember that more is not always better.


Adding Processors

Adding processors can be a much less scary prospect if you do some basic research on the server you're upgrading prior to attempting the addition. There are several key things to keep in mind as you toy with the idea of upgrading or adding new processors.

Verifying Compatibility in Single Processor Upgrades

It's extremely beneficial for you to read your product documentation or consult with the vendor or manufacturer regarding processor upgrades. The first thing you need to ascertain is whether you can add multiple processors to your computer. If so, how many can you add and what types will the bus accept?
If your server can accept only one processor, you must ascertain what types of processors you can add. By types, I mean what brand of processor and what speed. For example, if the computer currently has a Pentium II 200MHz processor in it, can you upgrade to a Pentium III (answer: no)? Can you upgrade to a 450 or 500MHz processor (answer: probably).
Nothing in a server upgrade is more critical than correctly identifying and procuring the processor upgrade that your computer manufacturer has said will be a suitable upgrade for the system. If you don't follow manufacturer guidelines and go with a clone, you'll likely encounter many problems.

Verify N + 1 Stepping
In the world of CPU manufacturing, the word stepping is akin to a version number. When a new microprocessor is released, the product version is set at step A-0. Later on, as engineering updates are made to the chip, new steppings are assigned. If the change is minute, the number of the stepping will be changed (i.e., A-0 to A-1). If the change is major, the letter of the stepping will change (i.e., A-0 to B-0).
When considering a CPU upgrade, especially if you're adding a CPU in order to turn the system into a multiprocessor computer, you'll want to verify the current CPU's stepping and match accordingly, or replace it if the stepping levels are too far from one another. Check with the computer manufacturer or vendor for more detailed compatibility information.
In a single CPU upgrade, the same caveats apply (matching the stepping to the range supported by the manufacturer).
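If the server is already up, the operating system will often report the stepping without your powering anything down. As a sketch, on Linux the stepping appears in /proc/cpuinfo, and a few lines of Python can pull it out; the field names are the ones that file actually uses.

# Check CPU stepping on a running Linux server via /proc/cpuinfo.
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith(("processor", "model name", "stepping")):
            print(line.strip())

Compare the stepping values it prints for each processor before you order an additional CPU for a multiprocessor upgrade.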


Verify Speed and Cache Matching

Additionally, in multiprocessor systems it's important to match the existing CPU's speed with the new addition to the family. This may be extremely difficult to do. If you've got a Pentium II 200MHz processor, you may well be out of luck finding a match. You might have to upgrade the first processor and buy a match for it in order to have matching CPUs. All of the CPU speeds in a multiprocessor system must match. You cannot have one CPU running at 133MHz, another at 700MHz, and so on; it won't work.
Additionally, you must match the L2 cache on each processor. If the first processor has a 256K L2 cache, it's paramount that the additional processor you're purchasing have the same size L2 cache.

Note that some processors require a DC power supply and have an associated slot on the motherboard for the power supply unit. Find out for sure whether your processor implementation has this and order accordingly. If in doubt, ask the manufacturer or consult documentation.

Summary

You know the problem with writing a chapter like this? As you write about all the exceptional technology, you just want to go out and set up a cluster of servers with four Xeon processors, a couple of gigs of RAM, and a storage system hooked to the main box by Fibre Channel with a few terabytes of disk space, just to see if you can get it to work! Hmm, maybe I could build it and sell it to my wife's company. Do you think that might be a little overkill for a home-based business with 10 employees?
Anyway, enough of all this dreaming stuff. On to Chapter 4, where we look at what kinds of memory to put into that big ol' bad boy.
You know, I could probably get all those servers to fit.
We talked about adding processors to a system. It's important to verify your system's capabilities, either by checking with the manufacturer or consulting system documentation. Systems are typically rated for a given range of microprocessors, so you may not be able to run out and buy the latest and greatest processor, slap it in your system, and hope that it works. It's important to understand your system's limitations.


Be sure to check the current processor's stepping, L2 cache capacity, and processor speed so that in a multiprocessor environment you're purchasing like equipment. Don't mix L2 cache or processor speeds. Check with the manufacturer to see what steppings are allowable between processors in a multiprocessor system.

Exam Essentials
Know what it means to cluster servers   Servers are clustered for a variety of reasons, usually to make sure that the single point of failure is moved back beyond the server. You can think of cluster servers as mirrored servers, though in reality, clustering can provide a broader range of services than just fault tolerance.
Know what high availability means   High availability is one of those buzzwords that means exactly what it implies. You want your network to be available always, 24 hours a day, 7 days a week. You take all the steps necessary to make sure your server is up and running to provide the appropriate services and applications to your users. It is highly available.
Know the basics of Fibre Channel   Fibre Channel can be used to link storage subsystems (or other devices) to the network. It provides faster throughput. Fibre Channel makes use of ports connecting through the Fibre Channel Fabric. Fibre Channel is used in storage area networks, to provide the bandwidth for remote access to large databases, and to provide bandwidth for remote backups.
Know about the different types of CPUs, including RISC, Pentium II, Pentium III, and Xeon   The Xeon supports two-way processing and multiprocessing, and it supports four-way multiprocessing without specialized chipsets. Xeons are more expensive than the Pentium III processors.
Know which CPU you would use in a high availability super server   The RISC processor and the Xeon are designed for high availability and high utilization servers.
Know the advantages and disadvantages of multiprocessing support   Before adding multiple processors, it is best to do a cost analysis. In some cases, it may be cheaper to add another server with fewer processors than it is to add a mainboard that can support more processors. For example, it may be cheaper to provide a cluster of four servers with two processors each than a single server with eight processors.


Make a checklist Know and understand the things to check for when
upgrading a system processor or adding processors to a multiprocessor
system.

Key Terms
Before you take the exam, be certain you are familiar with the following terms:
American National Standards Institute (ANSI)
arbitrated loop
channel
cluster servers
Complex Instruction Set Computing (CISC)
Cross-point
dual processing (DP)
F_port
FL_port
Fabric
Fabric Switched
Fibre Channel
Front Side Bus (FSB)
gigabit
kernel thread
Link Control Facility (LCF)
N_port
NL_port
point-to-point
Reduced Instruction Set Computing (RISC)
stepping
Storage Area Network (SAN)
Streaming Single-Instruction, Multiple-Data Extensions


Symmetrical Multiprocessing (SMP)


terabit
thread
uniprocessor (UP)
user thread
Xeon


Review Questions
1. When servers are clustered, you are providing redundancy of which

devices?
A. Network cards
B. Mainboards
C. RAID systems
D. Servers
E. Video cards
2. What are the key items that must match when you're attempting to add another processor to a multiprocessor system?


A. L2 cache
B. Cooling fan
C. Speed
D. Stepping
E. Model
3. You're working on a server replacement for a two-year-old computer that's had better days. You need to verify the CPU stepping. How do you go about gathering this information?
A. Read the serial number on the CPU and call the manufacturer.
B. Read the stepping number on the CPU.
C. See if the NOS reports the stepping number.
D. Obtain the stepping number from the Web.
E. Read the system documentation.


4. Which Intel processor is designed for higher scalability?


A. Celeron
B. Pentium II
C. Pentium III
D. Xeon
E. RISC
5. Louis has recently purchased and added a second processor to his

server. When the server boots, only one CPU reports online and the
NOS error logs report something about an L2 problem. What does
Louis need to check?
A. Secondary cache is mismatched between the two processors.
B. BIOS version is different between the two processors.
C. CPU speed is different between the two processors.
D. DC power converter missing.
6. The Linux 2.2 kernel can use up to 64 processors if what is done?
A. The kernel is rewritten and tweaked.
B. Nothing.
C. The processors are set to operate in parallel mode.
D. The processors have VSU on them.
7. What is the minimum number of servers in a cluster?
A. 4
B. 3
C. 2
D. 1


8. In a cluster server environment, there can be only one primary server. True or false?


A. True
B. False
9. RISC chips were designed by what two companies?
A. HP
B. Unix
C. Sun
D. IBM
E. Microsoft
F. Motorola
G. Novell
10. RISC stands for which of the following:
A. Reduced Instruction Set Computing
B. Redundant Installation Sun Coprocessors
C. Redolent Instructions to Sun Computers
11. CISC stands for which of the following:
A. Cisco's symbol on the New York Stock Exchange
B. Computerized Instruction Set Configuration
C. Complex Instruction Set Computing
12. What is one commonly overlooked checklist item to be sure you

include when considering a processor upgrade?


A. Matching of L2 cache
B. Verifying the need for additional DC power supply
C. Matching of CPU speed
D. Matching the CPU stepping number


13. Alejandro is trying to add a second processor to his server, but the computer won't boot. What could be the problem?


A. Mismatched L2 cache
B. Mismatched stepping number
C. Mismatched speed
D. No DC power supply for second CPU
14. What is the performance difference between RISC and CISC processors?
A. RISC processors often show two to four times the performance of

CISC processors in comparable technology and using the same


clock rates.
B. RISC processors often show 12 to 14 times the performance of

CISC processors in comparable technology and using the same


clock rates.
C. CISC processors often show two to four times the performance of

RISC processors in comparable technology and using the same


clock rates.
D. CISC processors often show 12 to 14 times the performance of

RISC processors in comparable technology and using the same


clock rates.
15. You have just mounted a Pentium III 500 processor on a mainboard

designed for a Pentium II. What else must you do?


A. Nothing.
B. Buy a new mainboard, because the Pentium III will burn out the

old one.
C. Buy a new processor, because the mainboard will burn out the

old one.
D. Flash the BIOS to provide for the extra instruction sets of the

Pentium III.


16. When the Pentium III first came out, there was some controversy

surrounding the processor. Why?


A. It was rumored to be made out of parts from the endangered

species list.
B. The floating-point decimal was not always accurate.
C. It had ID tracking.
D. It was thought to gather an inventory of hardware and software on

your PC and e-mail it directly to Microsoft.


17. What are some considerations to take into account when you're adding two more processors to a server that already has two, thus turning it into a four-way computer?
A. Stepping of all four processors must match.
B. L2 cache of all four processors must match.
C. Speed of all four processors must match.
D. Must have ports available on motherboard for additional processors.
18. What is the minimum number of nodes in an arbitrated loop?
A. 4
B. 3
C. 2
D. 1
19. What agency certified the specifications for Fibre Channel?
A. ASCII
B. SCSI
C. ANSI
D. EBCDIC
E. IEEE


20. In a Fibre Channel configuration, where is the N_port located?


A. The N_port is usually a termination card.
B. The N_port is usually a main port.
C. The N_port is usually a virtual connection.
D. The N_port is usually a randomly defined memory address.


Answers to Review Questions


1. D. Cluster servers means that two or more servers are acting in tandem

to provide data and applications in case of a failure.


2. A, C, E. It's important to match the CPU's brand, model, L2 cache, and speed. The stepping may be important; check with the manufacturer. The cooling fan probably won't matter.
3. A, B, C, D. Reading the system documentation isn't likely to provide any information on the stepping number of the CPU. Remember that a stepping number is like a version number, and a documentation booklet isn't likely to give you that kind of information. Your quickest bet is to see if the NOS reports the stepping number; some do and some don't. With some manufacturers, you can go directly to their Web site, key in the unit number for your computer, and get exact details for the unit, which may include the CPU stepping number. It's doubtful that the CPU itself will include the stepping number, but if you have to power the box down to get its serial number anyway, you can take a look to see if it's listed.
4. D. The key here is that the question specifically asks for Intel processors.

In that case, the answer would be the Xeon processor.


5. A. Secondary cache errors most likely mean that Louis has got an L2 cache mismatch. L2 typically comes in increments of 256K, so he may have one CPU with 256K of cache while the new one has an L2 cache of 512K.
6. A. The Linux 2.2 kernel can use up to 64 processors if the kernel is

rewritten.
7. C. You must have at least two servers in a cluster.
8. B. You can have two primary servers. It is important to note that the

data will not be synchronized between servers.


9. D, F. RISC processors were designed by IBM and Motorola.


10. A. RISC stands for Reduced Instruction Set Computing.


11. C. CISC stands for Complex Instruction Set Computing.
12. B. All of the items are important things to consider, but an often-overlooked item is the DC power supply. Generally, manufacturers will supply this item as a part of the equipment, but it's important to validate first that you indeed need one and second that you'll get an additional one with your order.
13. C. If the CPU speed is mismatched, the system won't operate.
14. A. RISC processors often show two to four times the performance

of CISC processors in comparable technology and using the same


clock rates.
15. D. The only upgrade that would be necessary would be to upgrade the

BIOS on the mainboard to support Streaming Single-Instruction,


Multiple-Data Extensions.
16. C. The Pentium III had ID tracking, which people felt could cause an

invasion of privacy.
17. B, C, D. The stepping number isn't nearly as important as the speed and L2 cache of the processors. Check manufacturer documentation for compatibility rules. The motherboard must be able to accept the additional CPUs.
18. B. An arbitrated loop has shared bandwidth, meaning there are more

than two nodes connected.


19. C. Fibre Channel is defined by the ANSI X3T11 Fibre Channel Standards

and Draft Standards.


20. A. The N_port is usually a termination card.


Chapter 4

Memory

SERVER+ EXAM OBJECTIVES COVERED IN THIS CHAPTER:

3.4 Increase memory.
- Verify hardware and OS support for capacity increase.
- Verify memory is on hardware/vendor compatibility list.
- Verify memory compatibility (e.g., speed, brand, capacity, EDO, ECC/non-ECC, SDRAM/RDRAM).
- Perform upgrade checklist, including: locate and obtain latest test drivers, OS updates, software, etc.; review FAQs, instructions, facts, and issues; test and pilot; schedule downtime; implement using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.
- Verify that server and OS recognize the added memory.
- Perform server optimization to make use of additional RAM.


In Chapter 3 we mentioned things like cache memory, but we really have not had the opportunity to delve deeply into the subject. Now, if you are one of those people who believe that memory is memory is memory, I hate to be the one to burst your bubble, but wrongo! I am going to be throwing around terms and acronyms like EDO, ECC, DIMM, SIMM, L1, L2, RAID cache, "writing back" and "writing through" (also known as "writing thru"), and a whole lot more.
In this chapter we are going to be looking at everything you need to know about cache memory and how to configure memory hardware for optimum server performance.

For complete coverage of objective 3.4, please also see Chapter 9.

Memory Types

You have to remember where I am coming from. The very first computer that I ever bought came standard with 512K (that's right, K) of Random Access Memory (RAM). Now, I will never forget the look on the salesman's face when I told him that I wanted to upgrade my system to 1 megabyte (MB) of RAM. He thought I was nuts. He actually told me that I was throwing away my money, because there would never be a use for that much memory. Now, some operating systems have minimum suggested requirements of 128MB for installation. Guess my friendly computer salesperson was wrong!


So, why is memory so important? If you want to speed up the performance of any PC or server, one of the first things you can do is add more memory. As a matter of fact, that is a pretty common solution to server problems. It is always easier for a Central Processing Unit (CPU) to grab information out of memory than it is for the CPU to have to go look for it on a hard disk, or in its instruction set. So, the more of the commonly referenced information we can store in memory, the quicker the CPU can find it. The faster the CPU does its job, the faster the server (or even a workstation) appears. It is as simple as that. How does the CPU know what is commonly referenced information and what isn't? It doesn't. So it just stores as much of the stuff that people have asked for as it can. When people (or the system) ask for stuff that it does not have in memory, it will usually rid itself of old stuff that no one has asked for in a while and replace it with the new stuff people (or the system) have recently asked for.
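That habit of tossing out whatever nobody has asked for lately is what memory designers call least-recently-used (LRU) replacement. Here is a minimal sketch of the idea in Python; the four-slot capacity and the access pattern are invented for illustration.

from collections import OrderedDict

# A tiny least-recently-used (LRU) store: when it fills up, the entry
# that has gone the longest without being asked for gets evicted.
class LRUStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)       # freshly asked for
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the stalest entry

store = LRUStore(capacity=4)
for block in ["a", "b", "c", "d", "a", "e"]:  # "b" becomes the stalest
    store.access(block, "data for " + block)
print(list(store.items))                      # ['c', 'd', 'a', 'e']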

DIPs
Memory comes in various shapes and sizes, so let's start by taking a look at some of the physical types of memory. I mentioned above that I had to upgrade my first computer from 512K to 1MB of RAM. This involved a technician adding some integrated circuits called dual inline packages (DIPs) to the mainboard. These types of DIPs are shown in Figure 4.1. DIPs have come in a variety of sizes, but now they are usually at least 256K per DIP. To be honest, I have no idea what they were when I bought my first computer, because for the first year I owned it, I was afraid to take the top off for fear all the electrons would escape.
FIGURE 4.1   DIP memory chip

As the need for memory began to increase, a decision had to be made. These DIPs plugged directly into the mainboard, and with four or eight of them per megabyte, they began to take up some serious room. So, the choice was to increase the size of the mainboard or find a different configuration for memory. The designers opted for a different configuration of memory, called the Single Inline Memory Module (SIMM).


DIPs are still used for a variety of memory applications. For example, VGA cards or network cards that have onboard cache will normally use DIPs.

SIMMs
The SIMM was just a different configuration of the DIP. Two types of SIMM
are shown in Figure 4.2.
FIGURE 4.2   SIMM memory

SIMMs were a breakthrough at the time of their introduction. The first SIMMs had nine small DIP chips on them and took up less room than DIPs. As a matter of fact, four SIMMs could be installed in the same space that used to be allocated to a row of DIP chips. Installing SIMMs was relatively easy: You placed the module in a slot at a 45-degree angle and gently pushed down and rocked it back until it locked in place. There were some tricks with SIMMs, the main one being that it always paid to buy the same type of SIMM from the same manufacturer. The original SIMMs didn't work and play well with SIMMs manufactured by different companies.


SIMM installation was actually a little more difficult than it sounds. Many of the mainboards had plastic connectors, and if you were not careful, you could break off the plastic. When that happened, the SIMM was not held securely in its slot and it did not work well. This was usually time for a new mainboard, and those were always expensive. For a while, I worked as a telephone technical support person, and my job was to talk people through the installation of memory SIMMs. As a technician, I always warned the installer to be really careful, and I just hated it when I heard something like, "Oh darn, look what I did" coming out of the phone.

Getting back to Figure 4.2, you can see that the SIMM, depending on age, comes in two different configurations. There was the 30-pin configuration and the 72-pin configuration. When the 30-pin SIMMs first came out, computers were working with 32 data bits. Unfortunately, each SIMM only handled 8 data bits, so you needed to provide one bank of four SIMMs. A memory bank was simply a set of four slots. Most computers had two banks of four SIMMs available, Bank 0 and Bank 1. The CPU would then address, or work with, one memory bank at a time.
72-pin SIMMs took care of part of the problem, because each 72-pin SIMM supported 32 data bits. If you were using a 486 CPU from Intel or a 68040 from Motorola, you only needed one 72-pin SIMM per bank to give the CPU the 32 data bits it was looking for.
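The arithmetic behind those bank sizes is worth a moment. This little Python sketch just divides the CPU's data bus width by the width of each module, using the widths given above:

# SIMMs per bank = CPU data bus width / data bits per SIMM.
def simms_per_bank(bus_bits, simm_bits):
    return bus_bits // simm_bits

print(simms_per_bank(32, 8))    # 4 thirty-pin SIMMs fill a 32-bit bank
print(simms_per_bank(32, 32))   # 1 seventy-two-pin SIMM does the same job
print(simms_per_bank(64, 32))   # 2 seventy-two-pin SIMMs for a 64-bit Pentium bus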

Working with the early computers was always fun, because they never ceased to provide unique opportunities. One of the opportunities was something called chip creep. If you remember all the way back to high school science, when things heat up, they expand; when they cool down, they contract. The same is true with chips. After a computer had been turned on and off several dozen times, the chips, which had expanded and contracted several dozen times, may have worked themselves just ever-so-slightly out of their slots. That meant the chip was not making proper contact, and the thing didn't work as advertised. As a user, you became adept at taking the top off your computer and gently pushing down on all the chips to reseat them.


DIMMs
After the SIMM came the Dual Inline Memory Module (DIMM). Look at
Figure 4.3.
FIGURE 4.3   Two types of DIMM: the SO DIMM (top) and the 168-pin DIMM

As you can see, there are two types of DIMM, but most of them install vertically into the mainboard, just like the SIMM. The difference between SIMMs and DIMMs is in the pin configuration. On a SIMM, the opposing pins on either side of the board are tied together to form a single electrical contact. With a DIMM, the opposing pins remain separate and isolated to form two contacts. DIMMs therefore usually have memory chips on both sides of the module. DIMMs are used in 64-bit computer configurations. This applies to CPUs like the Intel Pentium or the IBM RISC processors.
At the top of Figure 4.3 is the Small Outline DIMM, or the SO DIMM. This DIMM is like a 72-pin SIMM in a reduced size. It is designed primarily for laptop computers.
Next is the 168-pin DIMM. If you look carefully at it, you will notice the notches in each side of the module. Instead of having to install this module by inserting it at a 45-degree angle and rocking it back, this module slides into its slot with rocker arms on each side. You start the installation by opening the rocker arms, and when you push the DIMM into the slot, the rocker


arms close and lock the module down. The rocker arms will then hold the module firmly in place, eliminating any chip creep.
Now that we know what memory physically looks like, let's see how it is used.

Cache Memory

For your basic server, there are two types of memory: cache memory and
main memory. Main memory is referred to by a variety of names, including
Dynamic Random Access Memory (DRAM) or just plain ol RAM. DRAM is
the part of memory that is responsible for holding instructions and for holding
data that will be used by the applications running on your server. It is also used
by the server operating system itself. When the server's CPU executes an instruction from an application, it goes out to RAM to see if there is information stored
in memory that it can use. DRAM is kind of the holding area for information
that may be accessed in the near future. Depending on the server, the amount of
DRAM can measure in the gigabytes.
There is another type of RAM, called Static Random Access Memory
(SRAM). Your first question is probably, "Wait a minute. How can it be static and random at the same time?" Good question! SRAM is called static because the information doesn't need to be updated very often. With memory, this update process is called a refresh. SRAM is usually physically bulky
and limited in its capacity. SRAM usually comes in a DIP. SRAM can be used
for cache. Lets start looking at cache and then we will explore the different
types of main memory.
Cache comes in much smaller amounts and it is much faster than main
memory. It is usually measured in the kilobyte range. The express purpose of
cache is to make it easier for the component to respond to requests for service. Cache memory is used for the processor, for RAID controllers, and even for some types of network cards. In this section, we are going to look at how a processor uses cache, how RAID uses cache, and the differences between "write-back" and "write-through" cache. (You might have noticed these terms can also be spelled "write back" and "write-thru." Either way, they mean the same thing.)

Processor Cache
When a processor wants to access information, it wants that information as
quickly as it can get it, by using the fewest number of clock cycles. When you see
listings for the cache memory that will be used expressly for the processor, note
Copyright 2001 SYBEX, Inc., Alameda, CA

www.sybex.com

136

Chapter 4

Memory

that it comes in either Level 1 cache (L1), or Level 2 cache (L2). Level 1 cache
is physically in the actual processor itself. Level 2 cache is usually part of the
mainboard and is dedicated strictly to providing memory for the processor.
Cache memory is always SRAM. SRAM can be on a DIP, a SIMM, or a
DIMM. The cache memory controller is the brains of the cache memory system. When the cache memory controller goes out to get an instruction from
the main memory, it will bring back the next several instructions and keep
them in cache, also. This happens because it is very likely that these instructions will also be needed. Because the instructions are already loaded in
memory, when the CPU makes the call for them, the instruction will be read
from cache, making the computer run faster. When the computer runs faster,
the user is happier and the network administrator's life is easier.
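If a picture of that read-ahead behavior helps, here is a toy sketch in Python. Everything here (the names, the four-instruction line size) is invented for illustration; a real cache controller is hardware, not a dictionary.

    # A toy sketch of the prefetch behavior described above (illustrative only).
    CACHE_LINE = 4  # pretend the controller fetches 4 instructions at a time

    main_memory = {addr: f"instruction-{addr}" for addr in range(64)}
    cache = {}

    def read(addr):
        if addr in cache:
            return cache[addr], "cache hit"      # fast path: served from SRAM
        # Miss: fetch the requested instruction plus the next several,
        # on the bet that the CPU will want them soon.
        for a in range(addr, addr + CACHE_LINE):
            if a in main_memory:
                cache[a] = main_memory[a]
        return cache[addr], "cache miss (line filled)"

    print(read(8))   # miss: pulls instructions 8 through 11 into cache
    print(read(9))   # hit: already prefetched, no trip to main memory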
So whenever you see the term cache, remember that this is just another
way to speed things up. Any time something can be read from memory,
rather than having to go to the hard disk or to the BIOS to find the information, it is going to take less time. Cache is just a segment of memory that has
been reserved by the component involved to temporarily store information
or instructions for faster retrieval.
Now, another place where cache memory is used is in RAID systems.

RAID Cache
A RAID controller is a perfect place on a server for cache to come into play.
Think about it. You have several hundred people trying to access information from a RAID system or write information to the subsystem, almost
simultaneously. Now, I don't know about the users you have dealt with, but the users on my network never understood the word "patience." The biggest
complaint, it seemed, that people had was the speed of the network. Certainly, those few extra seconds were going to materially affect the standard
of living of some of these people!
Anyway, assume that we are looking at a busy server, or a server that is
dealing with lots of small I/O reads and writes. In this case, if the RAID controller were to become overwhelmed with work, it might have to put some
of the requests on hold while catching up. This is usually not a good thing.
So, the RAID controller uses cache as sort of a waiting room for requests. If
it cannot answer the request immediately, it may place the request in cache
until it can get to it.
As you imagine the differences between L1, L2, and RAID cache, notice
that one of the biggest differences is in size. L1 and L2 are measured in kilobytes while RAID cache is measured in megabytes. As a matter of fact, several
RAID controllers have minimum sizes before the caching will kick into effect.


Write-Back vs Write-Through Cache


Here we go back to the CPU again. Since we are discussing writes here, we are looking at how information is being written to the cache memory the CPU actually uses. There are two ways this can happen: with write-through or by using write-back. Here is how the two differ:

With write-through memory, each time there is a write operation to the cache, it is accompanied by an operation that writes the same data to main memory. If you have a system that is using write-through cache, then an I/O processor does not have to look in the cache directory when it reads memory. After all, everything that is in the cache is also in main memory. This method makes the access for the I/O processor simpler, but it creates high traffic loads between the CPU and the memory, and with high traffic comes lower I/O performance.

Write-back memory, on the other hand, has the CPU updating the cache during the write, but the actual updating of the main memory is postponed until the line that was changed is discarded from cache. At that point, the data that has been changed is written back to main memory.

If you are looking for a performance comparison, write-back caching provides somewhat better performance because it reduces the number of writes to main memory. With the performance increase, however, there is a catch. There is a slight risk that the data may be lost if the system crashes.
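Here is a minimal sketch of the two policies in Python. It is purely illustrative (again, real cache controllers are hardware, and these names are invented for the example), but it shows why write-back generates less memory traffic and why a crash can lose data.

    # Minimal sketch of write-through vs. write-back (illustrative only).
    cache, main_memory = {}, {}
    dirty_lines = set()

    def write_through(addr, value):
        cache[addr] = value
        main_memory[addr] = value    # every write also hits main memory

    def write_back(addr, value):
        cache[addr] = value
        dirty_lines.add(addr)        # main memory is NOT updated yet

    def evict(addr):
        # Only when the changed line is discarded does it reach main memory.
        if addr in dirty_lines:
            main_memory[addr] = cache[addr]
            dirty_lines.discard(addr)
        del cache[addr]

    write_through(0x20, "safe")
    print(0x20 in main_memory)       # True: already in main memory
    write_back(0x10, "fast")
    print(0x10 in main_memory)       # False: a crash here loses the data
    evict(0x10)
    print(main_memory[0x10])         # "fast": written back on eviction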

Real World Scenario


As you read through this, I am sure you are asking yourself when you may
have to actually do anything with cache. The answer, at least in my experience, is not very often, if at all. This is one of those topics that is great to
understand, because it will affect how your server performs, but you will
probably never have to "do" anything with it.
Knowing how these things work will help you when you compare specifications on various components. As a rule of thumb, information stored in
memory is faster to access than information stored anywhere else. Therefore, in this case, more is usually better! If you are comparing two RAID controllers, for example, and one comes with 4 MB of cache and the other comes with 16 MB, you can expect the controller with 16 MB of cache to provide better performance. You can also expect it to cost more.


Main Memory

Now we have to look at how all the other memory within the system
works.
So, what do you know about main memory so far? Just that the CPU uses
memory to store information and instructions that it may need later. The
more memory you have, obviously, the more instructions or information can
be stored there. But how is it stored there? We have already looked at one
memory technique when we looked at cache. Let's take a look at a couple of
other ways that memory is used. We are going to look at paged memory,
interleaved memory, and shadow memory before we get into error correction, parity, and all that fun stuff. Some of this stuff may not be part of the
objectives, but, for example, you have to understand how paged memory
works before you can understand why interleaved memory is better.

Paged Memory
A year ago, I bought a file server to use in my lab. When I bought the server,
one of the things that the base server was short on was memory. When I
checked the vital statistics on the server, the marketing information said that
it takes just plain old standard memory, so I figured, "Memory is pretty cheap. I can slap a few DIMMs in there and bring it up to where I want it to be."
Since it is a lab server, it is older and I bought it for a very good price, so I
figured I could add memory without a problem. When I received the server
and also received the technical specifications, it called for fast paged mode
(FPM), error correction code (ECC) memory. That stuff is pricey. Now,
instead of looking at $150 for a 128MB SIMM, I was looking at $500 for a
pair of SIMMs that equal 128 MB. So I did some research and this is what
I found.
Typical memory access by the memory controller is handled in a way that
is similar to reading a book. Just like reading this book, if you want information on paged memory, you access this page. With memory, if it wants
access to certain information, it just accesses the memory page. Once the
page has been accessed, then the information can be gathered in. This process works just great when you are talking about workstations that don't
necessarily have to access information out of memory a lot. When you start
talking about a server, this is a different story. See, with straight paged mode
memory, every time the system wants a bit of information it has to go and
access the appropriate page. With fast page mode memory, this delay is overcome by letting the CPU access multiple pieces of data on the same page,

Copyright 2001 SYBEX, Inc., Alameda, CA

www.sybex.com

Main Memory

139

without having to relocate the page each and every time. This works as long
as the read and the write cycles are on the loaded page.
Fast page mode has certain benefits. For example, there is less power consumption because the pages will not have to be located or sensed each time.
I am pretty sure that the implementation of FPM will not amount to a massive reduction in our electric bill. FPM also has some drawbacks, not the
least of which is price. So, if you can, avoid my mistake, and avoid FPM.
Paged mode memory, on the other hand, simply divides up the RAM in
your system into small addressable groups or pages. The pages can be from
512 bytes to several kilobytes long. The improved memory management on
mainboards has now advanced to the point that it is very similar to fast page
mode, where subsequent memory access from the same page is accomplished
without the CPU having to wait for the memory to catch up. This is referred
to as zero wait state. If the access does take place off the current page, there
may be one or more wait states added while the new page is located.
Now don't confuse paged mode memory with the way Microsoft Windows 2000 and Novell's NetWare 5.1 use page files to increase memory. Before we get into interleaved memory, let's look at that.

Memory Using Page Files


I first came across the concept of page files with Windows NT 4, and I have
to admit I was pretty impressed with the unique solution to an old problem.
The problem with servers has always been that you always seemed to be
short of memory. Back in the early days of local area networks, RAM was
selling for several hundred dollars a megabyte and the mainboards could
only address so much memory. As the CPUs got faster and computers got
smarter, the total amount of usable memory grew, and the price of memory
came down. Of course, at the same time, the operating systems got fatter and
started taking up more memory. So, as a system administrator, you wanted
to provide great response to your users, and you knew that the cheapest way
of doing that was to add more memory, but it didn't take too long before you simply ran out of room. So there had to be another way. That is where somebody started poking around your average computer and started thinking, "What is cheaper than memory (though not as fast) and can be extended almost without limits? Disk space!" Hey, disk space is a lot cheaper than DRAM, and it can be in abundant supply; surely there has got to be a way to use disk space in place of memory. And that was the birth of page files.


Page files work like this. When your server comes up, the network operating system (NOS) takes a look at the amount of free space that you have
on your disk subsystem and the NOS then takes part of that free space and
creates what is called a page file. The page file is only for the use of the operating system. This isnt a high-level secret place for network administrators
to store stuff. Now, as we have seen, as the server gets busy, it takes information that it needs, and moves that information into memory. Because it is
a busy server, the length of time instructions can stay in memory may be
exceptionally limited. When it comes time for the information to be flushed
from memory, the system has two choices: It can flush the information from
memory, so the next time it needs that information it can go back to the
application to locate it, or it can move the information to the page file, where
it will be more readily available.

Page files and virtual memory have a language all their own. For example, a page fault occurs when your system is looking for information in RAM and cannot find it, so it has to refer to the page file. Page faults come in two varieties: soft page faults and hard page faults. Information in memory is stored in frames. When information is moved to page files, there has to be a place to temporarily keep that data, and these places are called free frames. The plan is that these frames will be moved into buffers and then written to disk before replacement data comes along. If a page fault occurs and the data is in one of the free frames that has not actually been written to disk, this is called a soft page fault. If the data has already been written to disk, it is a hard page fault. Soft page faults are handled more quickly than hard page faults.
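If it helps, here is a toy model of that lookup order in Python. The structure is invented for illustration and is not any real operating system's paging code.

    # Toy model of soft vs. hard page faults as described above.
    ram_frames = {"pageA": "data-A"}      # pages resident in RAM
    free_frames = {"pageB": "data-B"}     # evicted, but not yet flushed to disk
    disk_pagefile = {"pageC": "data-C"}   # evicted and written to the page file

    def access(page):
        if page in ram_frames:
            return ram_frames[page], "no fault"
        if page in free_frames:           # still in a free frame: cheap reclaim
            ram_frames[page] = free_frames.pop(page)
            return ram_frames[page], "soft page fault"
        ram_frames[page] = disk_pagefile[page]   # must read the disk: expensive
        return ram_frames[page], "hard page fault"

    for p in ("pageA", "pageB", "pageC"):
        print(p, "->", access(p)[1])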

Page files are just near-line storage for information that otherwise would
be stored in memory. So, take a look at Figure 4.4 and you will see how the
CPU uses its memory.

FIGURE 4.4   CPU memory access (the CPU looks first to cache, then to main memory, then to disk)

Page files, or disk swapping, are part of a concept called virtual memory.
The virtual memory concept works like this: In a 32-bit computer, the maximum amount of memory that can be conceived is 4GB. The page file system
or the disk swap space is just an area on the hard disk that can be used as an
add-on to the main or physical memory. The page file then is all the memory
that can be used, and the physical memory is the memory that physically
exists. Look at Figure 4.5 and see if that makes it any clearer.

FIGURE 4.5   Virtual memory. Virtual memory space is all possible memory addresses (4GB in 32-bit systems): all that can be conceived. Disk swap space is an area on the hard disk that can be used as an extension of memory (typically 100MB): all that can be used. Main memory is physical memory (typically 64MB): all that physically exists.

So, with virtual memory space, you are dealing with an address that can
be conceived of but doesn't really correspond to any real memory. If something tries to access it, that attempt generates an error. With page file or swap
file space, if the address is read, the information is on the disk, so it has to be
moved to main memory. This is faster than searching an entire disk because
the memory table has the actual disk location mapped.
Finally, there is the main memory. When the processor wants something
from main memory, it is available immediately.
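The 4GB figure is just address arithmetic, which a couple of lines of Python make plain:

    # Sketch of the 32-bit address-space arithmetic mentioned above.
    address_bits = 32
    max_addressable = 2 ** address_bits          # distinct byte addresses
    print(max_addressable)                       # 4294967296
    print(max_addressable / (1024 ** 3), "GB")   # 4.0 GB: the "conceivable" limit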
So, page file memory isn't really memory, kind of. It is virtual memory. Let's get back to the real stuff and look at interleaved memory.

Interleaved Memory
In the eternal quest to make things faster, the next step up the memory food
chain is interleaved memory. The whole reason for using interleaved memory is
that it provides faster response time than paged memory. Check out Figure 4.6.
This is the way paged mode memory accesses information, one step at a time.

FIGURE 4.6   Non-interleaved memory (the CPU reaches a single bank of memory through the cache and bus, one access at a time)

Compare that to Figure 4.7, which shows the way interleaved memory
accesses four memory banks.

FIGURE 4.7   Interleaved memory (the CPU and cache connect through the bus to Memory Banks 0 through 3)

Interleaved memory combines two banks of memory into one. The first
section of memory is even and the second is odd, so memory contents are
alternated between these two sections. When the CPU begins to access memory, it has two areas that it can go to. With faster processors, they dont have
to wait for one memory read to finish before another one can begin. This
means, for example, that memory access of the odd portion can begin before
memory access to the even portion has completed.
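The alternation itself is simple enough to sketch in a few lines of Python (illustrative only; the function name is made up for this example):

    # Sketch of two-way interleaving: consecutive addresses alternate between
    # an even bank and an odd bank, so the next access can begin before the
    # previous bank has finished.
    def bank_for(address, banks=2):
        return address % banks    # bank 0 gets even addresses, bank 1 odd

    for addr in range(8):
        print(f"address {addr} -> bank {bank_for(addr)}")
    # address 0 -> bank 0, address 1 -> bank 1, address 2 -> bank 0, ...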
The good news is that interleaving can double your memory's performance. The bad news is that you have to provide twice the amount of memory in matched pairs. Just because your PC says it uses interleaving and
allows you to add memory one bank at a time, do not be confused. The computer is simply disabling interleaving and you may notice a degradation of
system performance.


Shadow Memory
Besides the various types of RAM being used on your server, there is also
memory that is read only. Not surprisingly, it is referred to as Read Only
Memory (ROM).

Testing Tip: If you are like me, once you walk into a testing room, you start to freeze up and question everything, thereby confusing yourself! And I tend to forget what it is that certain things do. For example, Read Only Memory. One of the things that I have found helpful is to pay close attention to what the words mean, because unlike marketing or management speak, computerese tends to be very descriptive. I mean, when you see ROM, if you know the acronym stands for Read Only Memory, you have a really good clue what that stuff is used for. If it had been named by someone in marketing or management, it would have been called something like "silicon-enhanced, integrated long-term memory paradigm used only for perusal and not for continuous reconfiguration in this regard unless we have at least three meetings." You get my point!

ROM devices are things like the Basic Input/Output System (BIOS) on
your mainboard. These devices tend to be very slow, with access times in the
several hundreds of nanoseconds. Because your CPU is much faster than
that, ROM access requires your CPU to go through a large number of wait
states before returning instructions, and that just slows down the whole system's performance. How big a deal is that? Well, think of the things that
have their own BIOS:


Mainboards

Video cards

SCSI controllers

These are things that will be accessed very frequently, so you can see where
it could become an issue. Some computers use a memory management technique called shadowing. When shadowing is employed, the contents of
ROM are loaded into an area of the faster RAM during system initialization. Then the computer maps the fast RAM into memory locations used by
the ROM devices. After that is done, whenever the ROM routines have to be
accessed, the information is taken from the shadowed ROM rather than
accessing the actual IC. In this way, the performance of the ROM can be
increased by more than 300%.


Real World Scenario


When you start talking about main memory, there are some tips that you
should keep in mind, especially when dealing with a server.
As I have alluded to time and again in this book, the two key factors of a
server are availability and performance. One of the ways that you can
impact both these factors is with judicious use of RAM.
Most server platforms will be designed to handle a specific type of memory.
When you buy your server, load up on as much RAM as you can afford from
the manufacturer. When the server arrives (or when you document your
server), make sure to record what type of memory is installed (manufacturer and part number), as well as the size and speed of each module. You should also document how many (if any) open slots you have available for expansion.
If you notice that performance on your server has degraded, do a baseline
check on how memory is being used. This can be accomplished in many
ways, depending on the operating system. If your NOS uses a page file,
check to see how much the page file is being accessed. If it is getting
accessed very, very often, you need to add memory to your server. Making
sure that you add the right kind of memory to the server will ensure that the
server will operate at peak efficiency and provide you with the most bang
for your buck!

Memory Error Checking

If you pause a minute and take a look at the big picture, you are going
to see that we are talking about some pretty serious stuff. We are talking
about instructions that tell the computer how to operate, as well as program information and data, all being moved into and out of memory at a rapid rate.
If that information is not moved correctly, nothing works properly, and your
life is not very much fun. So, it is vitally important that all of the instructions
and all of the data remain error-free. Think about all the things that can
result in corrupt instructions:


Electrical noise

Component failure

Corrupt drive information

Video problems


The oldest and simplest error-checking scheme is parity, which in the memory subsystem works like this: When a byte is written to
memory, it is checked, and a ninth bit is added to the byte as a checking or
parity bit. When the CPU needs to access the information from memory, the
CPU runs the numbers and calculates the expected parity bit. At that point,
the parity bits are compared and, if they match, the information is deemed
correct. If the parity bits do not match, the system comes up with an error
and, depending on the sophistication of the system, it may actually halt.
Every byte is given a parity bit. If you are working with a 32-bit PC, there are
4 parity bits for every address. If the PC is a 64-bit model, the number of parity bits increases to 8.
There are two types of parity: even parity and odd parity. With even parity, the parity bit is set to 0 when the number of 1s in the byte is even. That
will keep the number of 1s in the calculation even. If the number of 1s in the
byte is not even, then the parity bit will be set to 1, thus making the number
of 1s even.
The reverse is true with odd parity. In this case, the system wants to make
sure there is always an odd number of 1s in the byte. So, if the number of 1s
in the byte is odd, the parity bit is set to 0. If the number of 1s in the byte
is even, the parity bit restores order by being a 1.
If you look at this, you are going to notice that even and odd parity are
exactly opposite, and that is OK. It does not matter in the greater scheme of
things.
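The parity calculation is easy to sketch in Python. This is just the counting rule described above, not any particular memory controller's implementation, and the function name is invented for the example.

    # Sketch of the even/odd parity-bit calculation described above.
    def parity_bit(byte, even=True):
        ones = bin(byte).count("1")            # count the 1s in the 8 data bits
        if even:
            return 0 if ones % 2 == 0 else 1   # force an even total of 1s
        return 0 if ones % 2 == 1 else 1       # force an odd total of 1s

    data = 0b1011_0010                     # this byte contains four 1s
    print(parity_bit(data, even=True))     # 0: the count is already even
    print(parity_bit(data, even=False))    # 1: a 1 is added to make it odd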
Like most things that are simple and are free, parity has some shortcomings.
First of all, when it discovers a problem, it cannot fix the problem. It only
knows that one of the bits in the byte has changed, but it doesn't know which bit and it doesn't know if it changed from a 0 to a 1 or from a 1 to a 0. Also, what
happens if 2 bits are corrupted? If a 0 gets changed to 1 and another 0 gets
changed to a 1, as far as parity is concerned everything is wonderful.
Given this scenario, like most things in computing, someone decided there had
to be a better way, and that better way was called Error Correction Code (ECC).

ECC Memory
Like everything else, memory schemes evolve, and people whose priorities
are high availability and high reliability understand that higher cost usually
follows. ECC memory works in conjunction with the mainboard memory
controller to add a number of ECC bits to the data bits. Now, when data is
read back from memory, the ECC memory controller can check the ECC
data read back as well.


This means that ECC memory is superior to memory with just parity
for two reasons. First, ECC memory can actually correct single-bit errors
without bringing the system to a halt. It can also detect when there have been
2-bit, 3-bit, or even 4-bit errors, which makes it a very powerful detection
tool. If there is a multi-bit error detected, the ECC memory will report the
error and the system will be halted.
There is some additional overhead with ECC. It takes an additional 7 or
8 bits to implement ECC.
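Where do those 7 or 8 bits come from? They follow from the standard Hamming-code bound, sketched here in Python for illustration (this math is background material, not something the book's text spells out; the extra bit in each case provides double-error detection):

    # Sketch of the Hamming-code bound behind the "7 or 8 bits" of ECC overhead.
    def check_bits_needed(data_bits):
        k = 0
        while 2 ** k < data_bits + k + 1:  # k bits must cover m + k + 1 cases
            k += 1
        return k

    print(check_bits_needed(32))  # 6, plus 1 for double-error detection = 7
    print(check_bits_needed(64))  # 7, plus 1 for double-error detection = 8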

Have you ever wondered if there was a way to determine whether your system has parity or ECC memory? There is. All you have to do is count the number of memory chips on each module. Parity and ECC memory modules have
a chip count that is divisible by 3. Any chip count not divisible by 3 indicates
that the memory module is non-parity.

Extended Data Out (EDO) Memory


Several times during the first four chapters I have mentioned that if you
really want to know what something does, take a good close look at the
name. This is another example. Extended Data Out (EDO) RAM just keeps
information hanging around by lengthening the time the information is valid
in memory; therefore the data's existence on the data bus is extended. This
is done by changing the DRAMs output buffer and prolonging the time that
the read data is valid until the motherboard signals that it doesn't need the information any more. This helps ease
some time constraints on the memory and also allows an improvement in
performance with little or no increase in overhead or cost. There is an external signal that is needed between the motherboard and the memory, so the
motherboard must have an EDO-compliant chipset.
EDO RAM can be used in non-EDO motherboards, but obviously there
will be no performance improvement.

Unbuffered vs Buffered vs Registered


As we continue with everything you ever wanted to know about memory but
didn't know who to ask, we are now going to look at how the memory controller accesses and writes information onto the memory chip. This is done
in one of three ways: unbuffered, buffered, or registered.


Unbuffered
Unbuffered memory talks directly to the chipset controller. There is nothing
standing between the memory module and the controller. Therefore, information is written quickly to memory, with very little overhead.

Buffered
Buffered memory is a DIMM that has a buffer chip on it. If you are using a
DIMM with lots of chips on it, it requires a lot of effort on the part of the system
to write information into memory. Some manufacturers will use a re-drive
buffer on the DIMM to just boost the signal and reduce the load on the system. The
buffers are overhead and therefore they introduce a small delay in the electrical
signal.

Registered
With registered memory, the DIMM contains registers that will re-drive or
enhance the signal as it goes through the memory chip. Because the signal is
being enhanced, there can be a greater number of memory chips on the
DIMM. Registered memory and unbuffered memory cannot be mixed.
Just like buffered memory, registers slow things down. Registers delay
things for one clock cycle to make sure that all communications from the
chipset have been collected. This makes for a controlled delay on heavily
used memory.

Hardware Compatibility Lists


Whenever I think of hardware compatibility lists (HCL), I think of one of my favorite presents, which came from my youngest daughter. I can't remember if it was for a birthday, Father's Day, or just to shut me up. You see, my youngest daughter is a tech-weenie wannabe. She is not there yet, but she works with her mother, who is totally computationally challenged, in an office where the entire technological skill level is understanding how to turn the computers on. Denise has been forced into the role of on-site technical person while I am off traipsing around the country teaching people how to build and administer networks. Anyway, as Denise grew in her role, she kept calling with questions. Since I was usually on a short break from the classroom, I usually didn't have much time to spend with her, so we would do the quick fix, and if that didn't work, my next response would be, "So, what did the manual say?" She began to tire of that, and like most people in their 20s when dealing with their parents, she was not afraid to let her exasperation show. So, we simplified it. She would ask a question and I would reply, "RTFM," meaning Read The Fine (or some such) Manual. This was not an

answer she liked, but she did it, and lo and behold, the number of calls
decreased and her skill level increased. She began to see the humor in the
whole thing, and that is why she gave me a shirt monogrammed with RTFM,
and the note that said I should wear that during difficult classes.
Now this really is a do-as-I-say-and-not-as-I-do situation, because I have
been there, done that, got the T-shirt, and therefore should not have to read
the manual. Every time I take that attitude, I am immediately shot down by
doing something incredibly stupid (and usually costly) to prove the point.
So, let me put it to you this way. Whether you have just gotten a new copy
of the SuperWhizBang 6000 Operating System, or you need to put a card in
a computer, it never hurts to check the hardware compatibility list to see if
that card will actually work in the system. Or, if you really want to be daring,
you can read the compatibility list before buying the card, thus saving yourself time and frustration. These things are written for a reason, and they are
usually on the Internet or come with the program. Check to make sure your
system meets minimum requirements and you will save yourself tons of
headaches later.

Real World Scenario


Here is another case of true confessions. Earlier in the chapter, I talked about the
lab server that I have that uses really expensive memory. Well, truth be told, I
have a really well equipped lab, including several components that have not
been installed into servers and will probably never be installed in servers. My
wife keeps referring to that as wasted money. I tend to think of it as a lesson that
I should have learned, and probably didn't...
Hardware compatibility lists are wonderful things. You see, if I had taken
the time to check the HCL for a particular NOS, I would have found out that
many of my orphaned components would not work with the NOS. If I had
checked the HCL before I took advantage of the great deal that I found, I
would have saved myself tons of money. Unfortunately, I tend to purchase
first and check later, which is really a bad habit.


Now, when I am buying for my lab, the worst thing that is going to happen is that I will have a stack of components on my baker's rack that will probably never see the inside of a server. If I were doing this at a client site, I would have wasted the client's money and time, not to mention shooting my credibility! When I make a purchase for a client, I am very careful to make sure that the component appears on the HCL. If the proposed solution does not appear on the HCL, I then check the manufacturer's web site to see if there is support available. If there is alleged support available, I then download the drivers and test the component in a lab machine before trying to install it in a production environment. Remember, we want our servers to be highly available, and if we take a server down, we have to make sure that the time out of service is used to the best advantage.

New Stuff

There are some other memory technologies whose names you may run into that we haven't covered here: Rambus (RDRAM) memory, Double Data
Rate SDRAM (DDR SDRAM), and IBM Memory Expansion Technology
(MXT). Since two of these types of memory require special mainboards, it is
important that you know what the specifications mean before you fill out a
purchase order for new memory for your server.

RDRAM
RDRAM is a memory technology developed by Rambus and heavily backed by Intel, and it got off to a rocky start. Its life since hasn't been too great either. Rambus was originally supposed to be the next great memory advance, but then it got bogged down. Delivery was late, there were squabbles between Intel and memory manufacturers that led to lawsuits, and then when Rambus finally did hit the market, performance was nowhere near expectations. In published reports, Intel's own benchmarks showed that less-expensive SDRAM technology running at 133MHz outperformed RDRAM running at 800MHz.
When RDRAM was brought to market, memory manufacturers had to pay a licensing fee for the technology. Well, since the margin in
memory is nonexistent and the traditional memory shopper is looking for
price as well as performance, this strategy did not go over well.


DDR SDRAM
DDR SDRAM is like normal SDRAM in many ways. For example, it is synchronized to the front-side bus clock in the system, so the memory and the bus operate in lockstep. This means that, as bus speeds have increased, so has system performance.
The big difference between the two is the way that DDR moves the data: it transfers data on both the rising and falling edges of the clock signal, effectively doubling the throughput of the SDRAM. This means that if the clock rate is 133MHz, DDR will transfer data at an effective rate of 266MHz.
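That doubling is simple arithmetic, sketched here in Python for illustration (the function name is made up for this example):

    # Sketch of the double-data-rate arithmetic described above.
    def ddr_effective_rate(bus_clock_mhz):
        return bus_clock_mhz * 2   # data moves on both clock edges

    print(ddr_effective_rate(133))  # 266 (MHz effective transfer rate)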
DDRs also come in DIMMs, but they will not fit in the standard SDRAM
slot so you have to use a specially designed mainboard. The same problem
with configuration carries over to the laptop market. The SO DIMMs will
need a specially designed mainboard also. The DIMMs will have different
notchings and a different number of pins.
DDRs come in ECC for servers, and non-ECC for workstations.

DDR vs Rambus
In a study done by InQuest Market Research in November 1999 (http://
www.inqst.com/ddrvrmbs.htm), it was reported that the performance differences were negligible between Rambus and SDRAM.
Yeah, but that is SDRAM. What are the performance statistics for Rambus and DDR? InQuest used a benchmark called StreamD that was released by the University of Virginia. This benchmark is designed to evaluate the bandwidth of memory to the processor. The margin of error for this benchmark is less than 1%. In this study, DDR beat out Rambus by a significant margin in all tests, exceeding 30% in some cases and averaging a 24.4% performance advantage for this benchmark.
There is another version of the testing suite, this one to show the memory
types that work with Windows. This benchmark is WSTREAM.EXE.
According to the developers, the compound precision error rate is in the
range of 30%, and the developer has said that the program is inaccurate
under Windows NT 4. In tests using Windows 98, InQuest showed that the
DDR performance advantage had decreased to just 2.7%.


Keep in mind, this study was done in the fall of 1999, and the way memory technology has changed, all bets could be off by the time you read this. As a matter of fact, at the time this study came out, neither Rambus nor DDR had been released to the general public yet. Do your research before filling out the purchase order for any new technology.

IBM Memory Expansion Technology (MXT)


In June 2000, IBM introduced MXT, which it claimed could effectively
double the amount of memory in servers based on Intel processors. MXT is
based on a new cache memory design that compresses data down to half its
size. IBM is predicting that this will save large implementations thousands
and sometimes hundreds of thousands of dollars. In IBM's example, using
MXT with a typical rack-mounted server configuration using Windows
2000 can allow an implementation with 84GB of memory to act like a system with 168GB of memory, saving the company about $250,000.

More information on IBM research is at www.research.ibm.com.

Increase Memory

Let me make a very bold statement. I'd say that at least 25 to 50 percent of today's servers are RAM-starved. What I mean by that is that they don't have enough system memory to handle the load they're under. Oftentimes, when a server is RAM-starved, it can use physical disks as an extra memory source. This technique is called paging or virtual paging. But when a server is paging, the speed of the system slows down dramatically. It is always better to have a server utilize system memory than to have it page to disk for part of its memory.
Usually it's easy to cure RAM starvation: simply add more system RAM. But there are some concerns you'll have to address before going forward.


Verifying Hardware and OS Support


The first and probably most important question you'll have to get the answer to is whether the current computer has any space available for RAM upgrades. Sometimes a server's RAM slots are maxed out and you can't add any more RAM. In an instance like that, your next option is to replace the server or see if you can replace some slower peripherals with newer, faster models. While peering into the computer to find out if you've got available open slots, check to see what kind of memory is currently installed in the computer and how many open slots you have.

You'll also want to check the specifications of your operating system to make sure it can work with the amount of RAM you're planning on adding. Generally this isn't going to be an issue, but it's worth checking anyway.

Verify That Memory Is on Compatibility Lists


After determining that you have open slots and making a decision about how much memory you're going to order and from which manufacturer, you should check the compatibility list for your server to make sure your memory decisions are appropriate. Some computer manufacturers require that you use their memory in their systems. If you try to buy memory that they haven't officially blessed, you may run into trouble. It's very important, especially with tier 1 vendors (such as Dell, IBM, Compaq, HP, and others), to verify that the memory you're planning on purchasing will work in their computer.

I don't think it's wise to trust third-party clone memory, that is, memory that's designed to emulate the proprietary memory chips that manufacturers recommend in their systems. I've had very mixed luck with clone memory chips and tend to prefer to use the real McCoy even though it's a bit more expensive.

Why the care to make sure the memory's compatible? Because if you have memory troubles on down the road, you might have a lot of trouble diagnosing the problem. It's important to always be sure that you're putting equipment into the computer that the manufacturer has officially said will work. Otherwise, you're setting yourself up for many problems.

Verifying Memory Compatibility


The next item on the list has to do with the kind of memory you're going to buy. It's not enough to know that you want to buy a certain kind of memory with certain specifications; you'll also need to know what you're pairing the new memory with in the computer. For example, if your system currently has 72-pin 100 nanosecond (ns) DIMMs (dual inline memory modules) inside it,


you can't go putting 70ns DIMMs in the computer and hope that things work correctly. As a general rule of thumb, you'll want to closely match what's already in the computer. There are several things to consider:

Capacity   What is the capacity of the RAM that's currently in the computer? What is the maximum RAM capacity that the computer is capable of handling? If your computer can handle a maximum of 128MB of RAM and you've already got 64MB in the computer, you can only add 64 more megabytes before it's satiated.
Brand   If your manufacturer documentation doesn't have any particular brand in mind for RAM upgrades, be sure that you pick a known, reputable vendor for your RAM. Don't try to short-sheet your server by purchasing from an unknown vendor so you can save a buck or two. You'll likely find that the RAM doesn't work correctly and that you'll have lots of problems with it.
Speed   What is the speed of the RAM, in nanoseconds, that's currently in the computer? You cannot mix and match RAM speeds. It's vital that you match the RAM speed currently in the computer with the speed you're planning on adding.
EDO   Extended Data Output (EDO) RAM has the capability of retrieving the next block of data at the same time as it's sending the previous data block to the CPU. Do not mix and match EDO and non-EDO RAM. You might experience difficult-to-diagnose erratic activity with the computer after the upgrade.
ECC/Non-ECC   Error Correcting Code (ECC) memory has the ability to check the validity of the data as it's passing into and out of the chip. It's vital to make sure you don't mix ECC with non-ECC memory. You may want to consider purchasing all ECC memory for your server and throwing away any non-ECC chips you might encounter.
SDRAM/RDRAM   Synchronous Dynamic RAM (SDRAM) has the capability of running substantially higher clock speeds than older RAM chips. Newer SDRAM chips can run at a system's 100MHz bus speed, thus producing significantly faster throughput. But they bog down when running much faster than 100MHz. Rambus Dynamic RAM (RDRAM), a RAM chip invented by Rambus, Inc. (www.rambus.com), can run at phenomenally higher clock speeds, a maximum of 600MHz as of this writing. Thus, as newer system buses come out that are capable of running at higher clock speeds, RDRAM can keep up with the activity. Another kind of RAM, a competitor to RDRAM being designed by


a consortium of computer manufacturers and called SyncLink Dynamic


RAM (SLDRAM), will be introduced soon.
Contacts   The kinds of contacts used on the RAM chips are important as well. For example, some RAM chips have gold contacts. It isn't vital to match the type of contact currently in the system, but understand that gold provides a better electrical contact point than inferior conductor materials such as aluminum.
You must ascertain the kind of RAM currently in the system and compare it with the kind you'd like to purchase to make sure that you won't run into any conflicts between the old and the new.

You should also note whether you need to buy the RAM in pairs or not. In some computers the RAM is installed in pairs of chips, while in others you can buy a single chip. For example, some servers allow you to purchase one DIMM at a time, while other computers require that you purchase two DIMMs (and occasionally may require that you install them in matching slots).

Oftentimes a computer manufacturer's Web site will list the kind of memory that originally shipped with the computer, thus giving you some documentation that you can utilize when purchasing compatible additions.

Perform an Upgrade Checklist


As always, it's a great idea to write out a checklist that you can use. In your checklist, put a place for the old RAM so that you can write down all of its characteristics. Then supply a place where you can match the new RAM's characteristics with the old.

If in doubt about a RAM chip, call the vendor or manufacturer for information on the chip. If you can't ascertain a chip's characteristics, I'd advise buying all new chips and forgetting about pairing with the old ones.

Verify That Server and OS Recognize the Added Memory


Servers (computers in general) are really funny about adding memory. You put the new memory in the computer and watch carefully as you start it up. You'll generally see the computer's power-on routines count the memory and note the changes. But then you might be surprised to see an error on the screen telling you that the BIOS has encountered memory mismatch errors. You'll be prompted with a key to hit to enter the BIOS configuration utility.


This isn't any big deal. Just go into the BIOS, verify that the new memory size has registered, and then exit, saving changes (being careful not to change any other BIOS options!). The server will restart, and this time you'll see it successfully count the memory and pass through power-on without generating any more errors.

Once the OS has loaded, verify that it sees the correct amount of memory as well. If you encounter any problems, note any errors that are reported in the logs. I've never had a problem with an OS not recognizing the proper amount of RAM if the BIOS has successfully noticed and registered it.
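If you want to script that OS-side check, here is a hedged sketch in Python. It assumes a Linux-style system where total RAM is reported in /proc/meminfo; on other operating systems you would use that OS's own reporting tools, and the expected value shown is just an example.

    # Hedged sketch of verifying OS-visible memory (assumes Linux /proc/meminfo).
    def os_visible_memory_kb():
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1])   # value is reported in kB
        return None

    expected_kb = 256 * 1024                      # say you installed 256MB
    seen_kb = os_visible_memory_kb()
    print(f"OS sees {seen_kb} kB; expected about {expected_kb} kB")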

Perform Server Optimization to Make Use of Additional RAM
Finally, there are some things you can do OS-wise to make sure that the additional RAM is utilized properly. One thing that you can check is whether applications are optimized to use the correct amount of system memory. Oftentimes
applications will have configuration settings that allow you to tune how much
physical memory can be used at any one time. Once you upgrade memory, it
may be time to check these settings and make sure they're still correct, or update them as needed.
Also you can check the page file size and cut it back a notch or two. You
may also be able to tune the virtual paging settings so that the system makes
optimal use of RAM rather than paged memory.

Summary

This chapter has centered on memory: all different forms of it and all different uses for it. The information you have just gone over should make
it much easier for you to determine the proper type of memory to put into
your server to get peak performance. Memory is one of the key ingredients
to all server operating systems. In most cases, when you want a performance
increase, whether at the CPU or in main memory, the slogan seems to be,
Add more memory.
Before increasing the memory in a server, it's important to ascertain the kind of memory that's currently in the computer: how much capacity, the speed of the RAM, and what kind of RAM it is. You should also validate with the server manufacturer what brands of RAM are supported for addition to the server. Some manufacturers prefer that you use their brand of RAM and look with disfavor


on third-party RAM manufacturers. Check your documentation! There are key things you need to match when shopping for new memory: the memory's speed and whether it's EDO, ECC, SDRAM, etc.
You also need to verify that the BIOS is aware of the addition and check to make sure the OS sees the new RAM upgrade as well. We also discussed optimizing RAM usage by tweaking applications to use more RAM and by adjusting virtual paging settings.

Exam Essentials
Know the differences between L1 and L2 processor cache. L1 cache is
actually on the processor. L2 cache is usually part of the mainboard and
is used exclusively for the processor, but it is not part of the processor. L1
and L2 cache are measured in kilobytes.
Know why RAID uses cache. RAID systems use cache to improve
throughput and speed up disk reads and writes. RAID cache is measured
in megabytes.
Know the difference between write-back memory and write-through
memory. Write-through memory writes information to cache and to
main memory at the same time. Write-back memory has the CPU updating the cache during the write, but the actual updating of the main memory is postponed until the line that was changed in memory is discarded
from cache.
Know how memory interleaving works; know how page files work. Interleaved memory combines two banks of memory into one, alternating memory contents between an even section and an odd section, so that one memory access can begin before the previous one has completed. When your server comes up, the network operating system (NOS) takes a look at the amount of free space that you have on your disk subsystem, and the NOS then takes part of that free space and creates what is called a page file. The page file is only for the use of the operating system and is used to hold information from memory that may be used again in the near future.
Know the difference between page faults, soft page faults and hard page
faults. A page fault occurs when your system is looking for information
in RAM and cannot find it, so it has to refer to the page file. Page faults come in two varieties: soft page faults and hard page faults. Information in memory is stored in frames.
When the information is moved to page files, there has to be a place to
temporarily keep that data, and these are called free frames. The plan is
that these frames will be moved into buffers and then written to disk
before replacement data comes along. If a page fault occurs, and the data
is in one of the free frames that has not actually been written to disk, this

is called a soft page fault. If the data has already been written to disk, it
is a hard page fault. Soft page faults are handled more quickly than hard
page faults.
Know the difference between ECC memory and EDO memory. ECC is error-correcting memory. EDO memory lengthens the amount of time that read data remains valid on the data bus, which improves performance.
Know the difference between unbuffered, buffered, and registered memory. Unbuffered memory talks directly to the chipset controller. Buffered memory uses a buffer chip to boost the signal and ease the strain on the system. With registered memory, the DIMM contains registers that will re-drive or enhance the signal as it goes through the memory chip.
Know when to use a hardware compatibility list. Whenever you add
hardware to a server, check the NOS hardware compatibility list. If the
component does not appear on the HCL, check the component manufacturer's web site to make sure the appropriate drivers are available. When in doubt, don't install the device.
RAM upgrade. Know and understand how to upgrade system RAM
and what components to check for when shopping for upgrade RAM.

Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Basic Input/Output System (BIOS)
buffered memory
cache memory
Double Data Rate SDRAM (DDR SDRAM)
Dual Inline Memory Module (DIMM)
dual inline package (DIP)
Dynamic Random Access Memory (DRAM)
error correction code (ECC)
even parity


Extended Data Out (EDO)


fast paged mode (FPM)
hard page fault
hardware compatibility list (HCL)
interleaved memory
kilobyte
Level 1 cache (L1)
Level 2 cache (L2)
main memory
megabyte (MB)
Memory Expansion Technology (MXT)
non-ECC
odd parity
page fault
paged mode memory
parity
RAID cache
Rambus Dynamic RAM (RDRAM)
Random Access Memory (RAM)
Read Only Memory (ROM)
registered memory
Single Inline Memory Module (SIMM)
Small Outline DIMM
soft page fault
Static Random Access Memory (SRAM)
Synchronous Dynamic RAM (SDRAM)
SyncLink Dynamic RAM (SLDRAM)


unbuffered memory
write-back memory
write-through memory
zero wait state


Review Questions
1. What does DIP stand for?
A. Dual internal processors
B. Dynamic Induction Processing
C. Dual Inline Package
D. Dynamic Inline Package
2. SIMMs came in which pin configurations?
A. 28-pin
B. 30-pin
C. 64-pin
D. 72-pin
3. With how many data bits were computers working when the 30-pin

SIMMs first came out?


A. 16
B. 32
C. 48
D. 64
4. What was the minimum number of 30-pin SIMMs that you needed to

install to account for the proper number of data bits?


A. 1
B. 2
C. 4
D. 8


5. How many data bits were supported on each 72-pin SIMM?


A. 32
B. 36
C. 72
D. 144
6. What was the minimum number of 72-pin SIMMs that needed to be

installed to account for the proper number of data bits?


A. 1
B. 2
C. 4
D. 8
7. Thirty-pin SIMMs were installed in memory banks. What were the

names of these banks?


A. Bank 1 and Bank 2
B. Bank 0 and Bank 1
C. Bank A and Bank B
D. First Bank and Second Bank
E. U.S. Bank and Wells Fargo
8. What is DIMM an acronym for?
A. Dips used In Memory Modules
B. Dual In-place Memory Maker
C. Dual Inline Memory Maker
D. Dual Inline Memory Module


9. Other than physical appearance, how is a DIMM different than a SIMM?


A. Each DIMM supports a minimum of 64MB of memory.
B. Each DIMM has 32 DIPs on it.
C. The opposing pins remain separate and isolated to form two contacts.
D. DIMMs use only gold contacts where SIMMs use either silver or gold.
E. DIMMs hold cache memory; SIMMs hold RAM
10. What type of computer uses DIMMs?
A. 8-bit
B. 16-bit
C. 32-bit
D. 64-bit
11. What type of DIMM is used in a laptop?
A. Double-edged DIMM
B. 64MB
C. SO DIMM
D. SEW DIMM
12. How many pins are on the DIMM used in servers?
A. 168 pins
B. 164 pins
C. 162 pins
D. 160 pins
13. There are two types of RAM. Please select them from the list below.
A. DIMM
B. SIMM
C. DRAM
D. SRAM


14. Bonnie is attempting to add some system memory to a server. The

system's bus accepts 72-pin DIMMs. Bonnie is attempting to add


one new DIMM to the existing system. Once she adds the new memory, she gets no errors, but the computer doesn't recognize the new memory. What could be causing the problem?
A. Incorrect speed on new DIMM.
B. Brand of DIMM isn't compatible with rest of architecture.
C. Incorrect capacity of new DIMM.
D. Current DIMMs aren't ECC.
E. Old DIMMs have silver contacts.
15. Suzanne is working on a server that has four slots in it for DIMMs. Two of the slots have 64MB DIMMs in them already. Suzanne wants to add a 128MB DIMM, giving the system a total of 256MB of total system memory. When she adds the DIMM, the power-on self-test memory count shows the full 256MB, but she now gets an error telling her to adjust the BIOS. What could be the problem?
A. Nothing's wrong.
B. Can't pair DIMMs of different capacities.
C. First two DIMMs are ECC DIMMs, new ones not.
D. First two DIMMs are silver-tipped, new ones not.
16. You are going to install interleaved memory. What is the minimum

number of memory modules you must install to make sure that


interleaving is actually working?
A. One
B. Two
C. Four
D. Eight


17. What does ECC stand for?


A. Error Correcting Cache
B. Error Checking Cache
C. Error Checking and Correcting
D. Error Correcting Code
18. How does ECC differ from parity checking?
A. Parity checking can correct a single bit that has been changed.

ECC cannot.
B. ECC is cheaper because the code is actually embedded into a code

chip on the memory module.


C. ECC memory can actually correct single-bit errors. It can also

detect when there have been 2-bit, 3-bit, or even 4-bit errors.
D. There is no difference.
19. You have a memory module with nine chips on it. What kind of

memory is it?
A. Either parity or non-parity
B. Non-parity only
C. ECC only
D. ECC or parity
20. You have a server that is RAM-starved. You purchase a DIMM from

a reputable memory manufacturing company, install it, and find that


the system won't boot up. What could be the problem?
A. The type of memory you bought isn't supported by the computer manufacturer.
B. System requires DIMMs to be installed in pairs.
C. You've exceeded the system's memory capacity with the DIMM you're adding.
D. System BIOS needs to be adjusted.


Answers to Review Questions


1. C. A DIP is a dual inline package.
2. B, D. SIMMs came in 30-pin and 72-pin configurations.
3. B. When the 30-pin SIMMs first came out, computers were working

with 32 data bits.


4. C. Each SIMM handled 8 data bits, so you had to install SIMMs in groups of 4.
5. A. Each 72-pin SIMM supported 32 data bits.
6. A. With 72-pin SIMMs, just one module would provide the necessary

32 data bits.
7. B. The two banks were referred to as Bank 0 and Bank 1.
8. D. DIMM stands for Dual Inline Memory Module.
9. C. With a DIMM, the opposing pins remain separate and isolated to

form two contacts.


10. D. DIMMs are used in 64-bit computer configurations. This relates to

the Intel Pentium or the IBM RISC processor.


11. C. Small Outline DIMM or the SO DIMM. This DIMM is like a 72-

pin SIMM in a reduced size. It is designed primarily for laptop computers.


12. A. There are 168 pins on a DIMM.
13. C, D. The two types of RAM are Dynamic Random Access Memory

(DRAM) and Static Random Access Memory (SRAM). Choices A and


B are types of memory modules, not RAM.


14. A, B, C, D. If Bonnie is trying to put a 100ns DIMM in a system that has 70ns DIMMs currently installed, she could run into trouble. Also, if the server she's trying to upgrade has proprietary memory in it, she could create some problems by not buying manufacturer-recommended DIMMs for the system. Additionally, it's not a wise idea to mix ECC with non-ECC memory and so forth. Generally it's a good idea to ascertain what's currently in the system and match accordingly. The kind of contacts each DIMM has shouldn't affect the system's operation.
15. A. In almost all cases, after you add memory to a system, you have to

go into the system BIOS and acknowledge that the current memory
count is correct.
16. B. Interleaved memory combines two banks of memory into one, and

therefore you must have at least two modules.
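To picture what that means, here is a minimal sketch in C (purely illustrative; the interleaving itself is done by the memory controller in hardware, not by software) showing how consecutive word addresses alternate between two banks, which is exactly why a single module cannot interleave:

#include <stdio.h>

int main(void)
{
    /* Two-way interleave: consecutive words alternate between banks, so
       one bank can respond while the other is still busy or refreshing. */
    for (unsigned word = 0; word < 8; word++)
        printf("word %u -> bank %u\n", word, word % 2);
    return 0;
}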


17. D. ECC is the acronym for Error Correcting Code.
18. C. ECC memory can actually correct single-bit errors without bringing

the system to a halt. It can also detect when there have been 2-bit, 3-bit,
or even 4-bit errors, which makes it a very powerful detection tool.
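The parity half of that comparison is simple enough to sketch in C. Assuming the common even-parity convention (an assumption for illustration; implementations vary), the ninth bit is set so that the total count of 1 bits is even, which lets the memory subsystem detect, but not locate or fix, a single flipped bit; ECC's extra code bits are what allow it to also pinpoint and correct the error:

#include <stdio.h>

/* Even parity: the parity bit is 1 when the byte has an odd number of
   1 bits, so the stored 9-bit total always has an even count of 1s. */
static unsigned parity_bit(unsigned char b)
{
    unsigned ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (b >> i) & 1;
    return ones & 1;
}

int main(void)
{
    unsigned char byte = 0x5A;       /* 0101 1010: four 1 bits */
    printf("stored parity bit: %u\n", parity_bit(byte));      /* prints 0 */

    byte ^= 0x08;                    /* simulate a single-bit error */
    printf("recomputed parity: %u (mismatch, error detected)\n",
           parity_bit(byte));        /* prints 1 */
    return 0;
}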
19. D. Where the number of chips is divisible by 3, it can be either ECC or

parity memory.
20. A, B. First of all, you should always consult the manufacturer's guidelines before upgrading RAM in a system. Some RAM that can be bought from third-party vendors simply won't work very well in some systems. Some systems require that you install memory in pairs. You'll have a RAM bay A and RAM bay B, and you'll have to buy two DIMMs and place one in A and one in B to make it work. Either of these situations, or both, could be your problem.


Chapter 5

System Bus Architecture
SERVER+ EXAM OBJECTIVES COVERED IN THIS CHAPTER:

- 3.7 Upgrade peripheral devices, internal and external
  - Verify appropriate system resources (e.g., expansion slots, IRQ, DMA, etc.)
  - Perform upgrade checklist including: locate and obtain latest test drivers, OS updates, software, etc.; review FAQs, instructions, facts and issues; test and pilot; schedule downtime; implement using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.
- 3.9 Upgrade service tools (e.g., diagnostic tools, EISA configuration, diagnostic partition, SSU, etc.)
  - Perform upgrade checklist including: locate and obtain latest test drivers, OS updates, software, etc.; review FAQs, instructions, facts and issues; test and pilot; schedule downtime; implement using ESD best practices; confirm that the upgrade has been recognized; review and baseline; document the upgrade.


Oh, yeah, now we are getting into it. If you take the case off a server, one of the first things you are going to notice is that big green board that everything else plugs into. Call it a mainboard, call it a motherboard, call it whatever you want; that is where all the information must pass to go anywhere. Everything else we have been talking about won't work at all if there is something wrong with the motherboard. In this chapter we are going to look at what sets this component apart and how all the various parts come together to communicate.

For complete coverage of objective 3.9, please also see Chapter 9.

System Bus Architecture

If you have taken the A+ exam, some of the information we are going to
cover will probably be review, but that is not a bad thing. Review can usually
help us all! We are going to start this section by talking about bus basics and
then move into the way the Peripheral Component Interconnect (PCI) local
bus works. At that point, we will cover most of those objectives listed above.
All of these topics relate to information flow and the speed with which it
moves through the server and out to the network.

Bus Basics
You may have noticed that in Chapter 4, "Memory," there were several references to the bus speed that went without explanation. My reasoning was
that I would use that as a promo for the good stuff in Chapter 5. So, what is
a bus? A bus is a set of signal pathways that allow information to travel
between the components that make up your computer. These components can
be inside the computer or, as we saw in the discussion of SCSI in Chapter 1,


even outside the computer. When we talk about buses, we are generally talking
about one of three types: the external bus, the address bus, and the data bus.
In Chapter 4 we talked about cache memory and the fact that L2 cache
was external to the processor. The processor will use an external bus to get
information to and from the cache memory. The external bus is what the
CPU uses to communicate with all the other peripherals in the computer.
When the CPU wants to talk to a device, it will make use of the address bus.
In this case, the CPU selects a particular memory address that the device is
using, and sends information to that address using the address bus. When the
device responds to the CPU, it will use the data bus. How does the device get
the CPU's attention? Just like a small child, it uses the theory of interrupts.

This discussion is going to make use of some terms that were defined and discussed in earlier chapters: DMA and Bus Mastering. Because these are important
concepts, we will take a couple of sentences to review them here. A Direct Memory Access channel is a channel that a peripheral device can use to write specifically to a set memory address. DMA channels cannot be shared, and the device
does not use the CPU to access memory. Bus Mastering is similar. It is the ability
of a device to perform its function without needing to access the CPU. It writes
information to memory without accessing the CPU.

Interrupts
Interrupts are amazing things. When you have installed a component properly,
it is up to the interrupt to get the attention of the CPU when the component has
information or data to send. If the card is not installed properly, and you have
chosen an interrupt that is already being used, the card will either not function
or the system will completely lock up. Fortunately, with the PCI Bus, each
expansion slot (rather than the card) is assigned an interrupt, so the problem
of misconfigured components has been minimized.
When a card or peripheral has some data to send, it uses something called an interrupt request (IRQ) line. The IRQ is kind of like a student in class holding up her hand to get the attention of the instructor. In this case, though, the peripheral is trying to get the attention of the CPU.
Each type of bus has several different types of IRQs, and some of these IRQs are reserved. For example, IRQ 0 and IRQ 1 are reserved for system functions (the system timer and the keyboard). The other IRQs can be allocated depending on the peripherals that are installed.


IRQs are finite, meaning there are only a few that can be used. If your server
has several different peripherals that are not PCI, you could conceivably run
out of IRQs.

Interrupts are part of the expansion of a PC. Expansion is usually done by adding a card or a component into a connector (part of what is called an external bus) on the motherboard. The external bus that you are most likely familiar with is the expansion bus. The expansion bus is that motherboard component that allows you to expand the services provided by the computer simply by plugging in a specially designed circuit board. These boards are plugged into connectors or expansion slots, and this connection allows them to communicate with the CPU.
The expansion buses come in various flavors; for example, there were the old 8-bit and 16-bit ISA Buses, and there is the PCI Bus, the EISA Bus, and the Micro Channel Architecture from IBM. In this chapter we are going to look at PCI and EISA primarily, but it never hurts to have some background.

Expansion Slots
If you were to look closely at an expansion slot, you would see that it is made up
of several tiny copper finger slots. Each finger slot has a row of very small channels that make contact with the fingers on the expansion circuit board. These
finger slots are then connected to the pathways on the motherboard, and each of
the pathways has a specific function. One of the pathways provides the power
necessary to run the expansion card. Another set of pathways is the data bus,
which, as the name implies, transmits data to and from the processor. Another
set of pathways makes up the address bus. The address bus, you will remember,
allows the device to be addressed, or contacted, by the CPU, using a set of Input/
Output (I/O) addresses. There are also pathways for things like interrupts, direct
memory access (DMA) channels, and clock signals.
It is really pretty easy to tell what type of expansion bus you are using, just
by looking at the motherboard. As you have probably figured out by now,
I am big time into the history of computing, and it is never a bad thing to
know where the industry has come from. Some of these you may never see,
unless you go through a hardware museum, but it never hurts.
ISA 8-Bit Bus
Back in the early days, the expansion bus was only 8 bits wide and had a blazing speed of 4.77 MHz. There were just eight interrupts (discussed earlier in this chapter) and just four DMA channels. By today's standards, this is slower than a horse and buggy, but for the day it was blazing
fast. Since this was the very first PC Bus, and since it was designed by IBM,
the original makers of the IBM PC, they referred to this architecture as
Industry Standard Architecture (ISA). The 8-bit bus connectors are shown
in Figure 5.1, and an 8-bit bus expansion card is shown in Figure 5.2.
FIGURE 5.1

Eight-Bit bus connector

If you look carefully at the picture above, notice how wide those finger slots
were. We will be able to compare those with newer technology in just a second. Here is the type of card that took advantage of that slot (see Figure 5.2).
FIGURE 5.2

Eight-Bit bus expansion card


ISA 16-Bit Bus


As technology advanced, we went from an 8-bit to a 16-bit bus. This went
a long way toward solving the biggest problem with the 8-bit bus: the fact that
it was only 8 bits wide. The new ISA Bus had a 16-bit data path, and you also
may have heard it referred to as the AT Bus.
Look at Figure 5.3. Notice that the ISA Bus slots have a small bus connector behind the standard 8-bit bus. This connector provides several more
fingers, which are signal lines. The more signal lines available, the faster the
card can gather in information.
FIGURE 5.3   ISA Bus connectors (labels: Motherboard, 8-bit bus slots)

So, what you have here is the old 8-bit slot with an add-on. The ISA Bus
also helped expansion by adding eight more interrupts and four more DMA
channels. It was quite easy to spot the kind of board that fit these new ISA
slots; they looked like Figure 5.4.

FIGURE 5.4   ISA Bus expansion card (16-bit and 8-bit connectors labeled)

If you look closely at the card, you will see that toward the front of the card there is an 8-bit connector, separated from the rear connector by a slot. This architecture was really interesting because of compatibility. For example, if the expansion card was an 8-bit card, it would run in either an 8-bit or a 16-bit slot. If you had a 16-bit card, it would naturally run in the 16-bit slot for which it was designed, but it would also run (albeit a lot slower) in an 8-bit slot. So, pretty much everything was compatible with everything else.
Micro Channel Architecture (MCA)
About this time in the history of computing, the company that invented the
PC, IBM, was beginning to think the world was passing them by. Their share
of the market was steadily declining and they figured that they had to do
something to get it back. That something was the Personal System/2 (PS/2).
Along with the PS/2, IBM was introducing a new type of data bus called
Micro Channel Architecture (MCA). This bus was supposed to put the ISA
Bus out of business by utilizing a smaller connector with thinner fingers.
MCA was revolutionary because it was available in either 16-bit or 32-bit
versions. Secondly, it could have several Bus Mastering devices installed, and
the bus clock speed was about 25% faster than the old 8 MHz systems,
screaming along at 10 MHz. The really revolutionary part of the puzzle was
the way that you configured the expansion cards. In all the other bus technologies, the cards were configured by jumper settings, or by DIP switches.
With MCA, device configuration was done with software. These were the
first software-configurable expansion cards.


This was an interesting concept, but it had some problems. First of all, all
device configurations were done from a single boot diskette that contained all
the information files for all the devices. When you made a change, the change
was not only written to the card, it was also written to this diskette. That diskette was the only diskette that knew what was in the system and how each
device was configured. At the time, I was doing onsite hardware support. Whenever I ran into a PS/2 device (and some of them were servers) I
knew there was going to be trouble. I would ask for the configuration diskette,
and usually receive a blank look from the customer. It was then up to me to
find a PS/2 diskette and configure the entire system from scratch. Great idea,
but they forgot to take it that one extra step of either saving the configuration
files to a disk or making the devices able to provide configuration information
if asked by a setup program.

You could always tell an MCA card: They don't call IBM Big Blue for
nothing! Look at Figure 5.5.
FIGURE 5.5   MCA expansion card (note the blue handles)

As with most things that came out of IBM at the time, the MCA architecture was very proprietary. At a time when the buzzword was "compatible," IBM wasn't. In addition, IBM charged vendors who developed their own
expansion cards 5% of their gross receipts. Even way back then, margins
were slim on computer hardware, and this put the cost of MCA peripherals
out of sight.


EISA
Back in the late '80s to early '90s it was still a computer war out there. IBM was selling PCs because they were IBM. The catch phrase at the time was, "You can never get fired for buying IBM." But there was competition, led by the Gang of Nine. The Gang of Nine was made up of nine computer manufacturers that thought there had to be a better way than MCA to get faster speeds.
The Gang of Nine consisted of some of the top names in the industry at the
time: AST, Compaq, Epson, Hewlett-Packard, NEC, Olivetti, Tandy, Wyse,
and Zenith. They began to offer an alternative to MCA called Extended Industry Standard Architecture (EISA). For a while, EISA was popular in both 386
and 486 computers until about 1993 when PCI came along.
EISA had many of the same things going for it that MCA had, but it also
had compatibility with the older ISA board. Take a look at Figure 5.6 and
Figure 5.7.
FIGURE 5.6

EISA Bus connector

Now, one of the things you do not see in that picture is how deep the connector slots really were. They were about twice as deep as the old ISA slots
and 8-bit slots. Compatibility was achieved by staggering the finger slots. Look
closely at Figure 5.7 and you will see that some of the grooves are longer than
others.


FIGURE 5.7

EISA expansion card

With this type of setup, if you were installing an 8-bit card, it would only
go so deep into the expansion slot. A 16-bit card would go as deep, but use
the back connector. An EISA card, on the other hand, would slip all the way
to the bottom of the connector, making a 32-bit data path.
EISA Configuration

Are you familiar with Plug-and-Play hardware? Well, EISA was a precursor
to Plug and Play, and at the time, it was certainly a lot easier than other forms
of hardware installation.
Let's say that you were installing a new network card in an ISA-based
machine. Before you installed the network card, you had to check the computer to find out (at the very least) what interrupts were being used by other
devices. Then you configured the network card to use an interrupt that was
not being used by any other device, installed it, turned the computer on, and
ran the appropriate driver for the card. If you did your job right, the driver
would load and you had network connectivity. What usually happened, on
the other hand, was that you (okay, read that I) had guessed wrong and the
IRQ was already in use. This necessitated starting all over. Things changed
with EISA.
With an EISA Bus, on the other hand, you would take the top off the computer and install the EISA card in an EISA slot. The toughest part of the process was remembering what slot you installed it into. Anyway, after the card
was seated, you turned the computer on, and as part of the Power On Self
Test (POST) the computer would figure out that there was something new,
different, and interesting going on inside. The computer would ask you to
configure the device, and you would use a program called EISA Configuration (EISA Config for short) to set the IRQ, DMA, and anything else you
needed to set. All this was done via the slot number, and the information was
then saved on the card. This made a technician's life remarkably easy, because the EISA Config utility would even go out and check to find out what settings were already being used. That way, you almost couldn't mess it up.
The difference between configuring a machine with MCA and EISA was
that with MCA, you needed the diskette with the configuration utility for
that specific computer. Without the specific configuration disk, you reconfigured the whole machine. With EISA, you needed an EISA configuration
utility for that brand of computer. Sometimes, EISA configuration utilities
would even work across brands. So, if you carried around a diskette with the
Compaq EISA Config on it, you could configure all Compaq EISA machines.
There were other enhancements of EISA over ISA:

- The CPU, DMA, and Bus Mastering devices could make use of a 32-bit memory-addressing scheme.
- The data transfer protocol that was used for high-speed burst transfers was synchronous.
- EISA supported intelligent Bus Master controllers for various peripherals.
- EISA had better ways of handling DMA arbitration and transfer rates.
- EISA made use of shareable interrupts.

EISA finally gave way to PCI. That is what the majority of systems on
the market are using today. Let's take a closer look at the new industry
standard bus.


Real World Scenario


The preceding section contains a lot of material that you won't need to pass
the Server+ test, but it does give you a great background in the history of
hardware and how it has evolved. If there were certain parts that you should
pay close attention to, I would suggest looking over the material on the EISA
bus. EISA was a revolutionary architecture for its time, and there may even be
some EISA servers still trucking away in a production environment.
EISA's configuration utility was really a precursor to Plug and Play technology. The utility went out and found the devices that were installed into
the server and then helped you to configure those devices without the use
of jumpers or switches. All of that is taken for granted now, but for the
time, it was a radical concept.

Peripheral Component Interconnect (PCI)

Now that we have had a pleasant walk down memory lane, let's get closer to the present. When Intel released the Pentium processor, all of the
existing buses became instantly obsolete. Every bus up until this moment had
been of the 16-bit or 32-bit variety, and then along came the Pentium, which
was a 64-bit processor. Using a Pentium processor with a 16-bit or 32-bit
bus would be like pulling the engine out of a Ferrari and replacing it with
something from a Yugo. It just shouldn't be done and performance would
suffer greatly.
Peripheral Component Interconnect (PCI) works well with the current iteration of the processor. It can handle both a 64-bit and a 32-bit data path. It is
also processor independent, which means that it uses a form of Bus Mastering.

PCI Bus Mastering


Geek term alert! Bus Mastering is one of those terms that you tend to see
around every time you pick up a catalog or read a description of a new piece
of hardware. It is one of those terms that you may have heard and nodded
your head knowingly, all the while thinking to yourself, "Someday, I am going to read a book where that term is really explained." That day is today.


Back in the early days of PCs it was up to the microprocessor in the computer to manage every byte that was moved along the data bus. It was up to
the microprocessor to read the byte from one device or from memory, decide
where that byte belonged, and then write the byte to the proper location. Soon
it became obvious that this was a whole lot of work that could be farmed out
to other devices. The microprocessor, for example, did not need to be handling
everything that went into and out of the expansion bus. After all, the microprocessor is supposed to be the manager of the operation, and all really good
managers know how to delegate responsibilities. Bus mastering is the result of
that delegation.
With Bus Mastering, the microprocessor does not have to be involved in
every transaction. It can delegate control to special circuits called bus controllers, and these bus controllers will direct traffic between different circuits. The
actual device that takes full control of the expansion bus is called a Bus Master.
The device that will end up receiving the data from the Bus Master is called the
bus slave. Some of the more politically correct systems may call the master and
the slave the initiator and the target.
So, the bus controller can manage multiple Bus Masters, and Bus Masters
take control of the actual expansion bus through a process called bus arbitration. Each type of bus has a protocol that is used for managing this arbitration
process. That protocol can be based in hardware or software, though it is usually
hardware-based.

PCI Bridges
Bus Mastering makes it sound like there is just one bus, and that is not even
close to the truth. The average PC has several buses and these are usually
operating at different widths and at different speeds. It is kind of a system
board designer's hell. Somehow there has got to be a way to hook up all
those different types of buses together and get them to work in a cohesive
way. This really took the forefront when PCI was introduced, because
remember, PCI was designed to be processor independent.
The problem was solved with something called the PCI Bridge. Think
about what a bridge does in your world. It moves things from one location
to another over some kind of obstacle. That is just what a PCI Bridge does,
but it does it with data. The PCI Bridge moves that data from one system bus
to another system bus, and it is up to the bridge to handle all the gory details
of the transfer. This can include things like changing the data format and
protocols without making use of any outside hardware and software products. The bridge can be some form of standalone hardware, or it may just be
part of the chipset that makes up the PC's mainboard.


PCI Bridges are really busy. In a typical system, for example, the bridge handles moving information from the microprocessor bus to the high-speed PCI Bus and even to an old, outdated ISA compatibility Bus. PCI Bridges can even link to other PCI Bridges to form a PCI-to-PCI Bridge, or a PPB.

How far can this go? PCI Bridges can be connected to other PCI Bridges up to
a maximum of 256 PCI Buses in a single PC. We will cover more on this when
we talk about Hierarchical Buses and Peer Buses.
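If you are curious how software actually reaches those 256 buses, the classic x86 configuration mechanism uses two I/O ports: an address written to port 0xCF8 selects the bus (an 8-bit field, hence the 256-bus limit), device, function, and register, and the data is then read back from port 0xCFC. The C sketch below is a minimal Linux/x86 illustration only; it assumes root privileges for iopl(), and a production tool would read /proc or /sys rather than banging the ports directly:

#include <stdio.h>
#include <sys/io.h>              /* iopl(), outl(), inl(); Linux x86 only */

#define CONFIG_ADDRESS 0xCF8
#define CONFIG_DATA    0xCFC

/* Read one 32-bit configuration register for a bus/device/function. */
static unsigned pci_read(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    outl((1u << 31) | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC),
         CONFIG_ADDRESS);
    return inl(CONFIG_DATA);
}

int main(void)
{
    if (iopl(3) != 0) {                   /* direct port I/O requires root */
        perror("iopl");
        return 1;
    }
    for (unsigned bus = 0; bus < 256; bus++)        /* 8-bit bus number  */
        for (unsigned dev = 0; dev < 32; dev++) {   /* 5-bit device slot */
            unsigned id = pci_read(bus, dev, 0, 0);
            if ((id & 0xFFFF) != 0xFFFF)            /* 0xFFFF: empty slot */
                printf("bus %3u dev %2u: vendor %04X device %04X\n",
                       bus, dev, id & 0xFFFF, id >> 16);
        }
    return 0;
}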

Arbitration, 4-Bit Code, and Cache


Negotiation is the key! PCI uses a form of Bus Mastering that is arbitrated,
just like other types of expansion buses, but PCI has some advantages. The
first is that the PCI has its own bus command language that is a 4-bit code.
In addition, PCI Bus Mastering supports a secondary level of cache.
This is how it works: A PCI Bus Master board needs to take control of the
bus, so it sends the appropriate signal to the host. As soon as it receives confirmation to take control, the transfer is started. Every PCI board has its own
set of specific signals that are related to the slot the board is in. It can use
these slot-specific signals to ask for bus control and receive permission.
By using this system, there is a great deal of flexibility in the way priorities
are assigned to the computer. The system designer has the ability to change
the arbitration procedure to fit the needs of the expansion device rather than
the needs of the original, potentially outdated bus specification.
There are two special electrical signals that control Bus Mastering, the
Request (REQ#) and the Grant (GNT#). The process starts when the bus
master sends its Request signal when it wants to start controlling the bus. At
that point the circuitry called the central resource sends a Grant to the master
that gives permission to take control. Each PCI device gets its own dedicated
set of Grant and Request signals.
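Since the specification deliberately leaves the arbitration algorithm to the system designer, there is no single "official" scheme to show; the toy C model below is only meant to give you the flavor of the REQ#/GNT# handshake, using a simple rotating-priority arbiter:

#include <stdio.h>

#define NDEV 4

int main(void)
{
    int req[NDEV] = {1, 0, 1, 1};    /* REQ# asserted by devices 0, 2, 3 */
    int last = NDEV - 1;             /* rotation starts at device 0      */

    for (int cycle = 0; cycle < 5; cycle++) {
        int granted = -1;
        for (int i = 1; i <= NDEV; i++) {        /* rotate the priority  */
            int d = (last + i) % NDEV;
            if (req[d]) { granted = d; break; }
        }
        if (granted < 0) {
            printf("cycle %d: no REQ#, bus idle\n", cycle);
        } else {
            printf("cycle %d: GNT# to device %d\n", cycle, granted);
            req[granted] = 0;        /* transfer completes, REQ# dropped */
            last = granted;
        }
    }
    return 0;
}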

Hierarchical PCI Bus


Depending on the way the PCI Bridges are configured, you can have a server
that takes advantage of either a hierarchical PCI Bus or of a Peer PCI Bus.
With a hierarchical PCI Bus, the buses in the hierarchy operate concurrently.
That means that a PCI Master and a PCI target on the same PCI Bus can
communicate even if the other PCI Bus is busy. In this case, traffic between
devices on one bus is isolated from the traffic generated by all the devices on
the other bus, which should help to enhance system performance.


Other implementations of the Hierarchical Bus allowed devices on one


bus to communicate directly with devices on another. For example, National
Semiconductor released a PCI-to-PCI Bridge chip called the PicoPower Nile
II. This chip was designed to be used in high-end desktops, servers, and even
as part of multifunctional add-on cards. This chip provided additional PCI
Buses without system degradation.
The Nile II chip supported speeds of up to 120 MBytes/sec and allowed
for concurrent operation on both the primary and the secondary bus. This
bus hierarchy can support up to 256 PCI devices per bus, and there can be
a total of 256 buses. That totals over 65,500 devices.
In order to reduce the amount of time that the bus was waiting for other
devices, there were two special processes utilized. One was referred to as hidden arbitration and the other was bus parking. Hidden arbitration allowed
bus arbitration to go on while an initiator was performing a data transfer.
With bus parking, the PCI assigned the bus to a specific master, which was
usually the PCI Bridge. When the master had a request for the bus, the
request was granted immediately, since the bus was parked on it. The access
time to the bus in this case was as little as 60 nanoseconds (ns).
Take a look at Figure 5.8. This is a diagram of how a bridged PCI architecture works.
FIGURE 5.8   Bridged PCI architecture (the processors and memory sit on the host bus; a Host-to-PCI Bridge connects the host bus to the primary PCI Bus at 133MB/sec, and a PCI-to-PCI Bridge hangs a secondary PCI Bus, also 133MB/sec, off the primary; each PCI Bus has its own slots)


You will notice that there is only one data path to get to the host bus.
Everything has to go through the PCI-to-PCI Bridge to reach the primary bus
and then through the Host-to-PCI Bridge. While this method does provide
for a great number of devices, there is no load balancing capability.
Let's see what it is like with the Peer PCI Bus.

Peer PCI Bus


The other way that a primary and secondary bus can connect to the host is
with a Dual Peer PCI Bus. Take a look at Figure 5.9 to see how the Dual Peer
is configured.
FIGURE 5.9   Dual Peer PCI Bus (two Host-to-PCI Bridges connect the primary and secondary PCI Buses, each running at 133MB/sec, independently to the 540MB/sec host bus carrying the processors and memory; a PCI-to-EISA Bridge attaches a 33MB/sec EISA Bus to the primary PCI Bus, and each bus has its own slots)

You will notice that, in this case, the two PCI Buses are linked independently to the processor bus using two Host-to-PCI Bridges. Since there are
two independent buses, there can be two Bus Masters transferring data at the
same time, giving more overall throughput and a higher bandwidth. This is
especially useful if you have a server with two or more peripherals that are
bandwidth intensive. If you split the peripherals between the two buses, you
are in effect creating load balancing.


If you are using a server that makes use of the Peer PCI Bus, there has to
be configuration of the Input/Output (I/O) subsystems. The load balancing
configuration should be taken into account even before the initial system
setup and configuration takes place.
This bus balancing is accomplished by actually balancing the I/O bandwidth for each bus. This should produce the optimal performance on a system. This will work great with Peer PCI Buses, but it may not work as well
with a bridged PCI system. Here are some recommendations on when to do
load balancing.


- If you are using a bridged architecture, load balancing is not recommended. With a bridged architecture, make sure the primary bus is the first one that is populated.
- For Dual Peer architecture, load balancing is recommended. If the I/O throughput is high, then the workload should be split between the buses.
- If your Dual Peer architecture also makes use of PCI Hot Plug slots, there is going to be some tradeoff between high availability and high throughput.

Bus Balancing
Here are some guidelines on how to balance a PCI load:

- If you have several network or array controllers, make sure they are split between the buses.
- If you are installing an odd number of controllers, like two network interface cards and a single drive array controller, split the two network cards between the buses. Network controllers use more bandwidth than drive controllers, so split the workload between the buses.
- Avoid putting two network cards on the same bus, unless both buses already have a network card installed. It is better to have a system that has a dual-port network card on each bus, rather than to have two individual network cards on each bus.

So, how does the processor bus know when the network cards need attention? PCI Buses do use interrupts; they just use them a little differently.


PCI Interrupts
PCI is a self-contained expansion bus. The interrupts that would normally be
set at the card level are managed at the expansion-slot level by the software
that drives the peripheral devices. With PCI, there are four level-sensitive
interrupts that have interrupt sharing enabled, and these can amount to up
to 16 separate interrupts when examined as a binary value. The PCI specification does not define what the actual interrupts are for each slot or even
how they are to be shared. All of that design relationship is left up to the person who is designing the expansion device. That means that these details are
usually not handled at the hardware level, as was the case in the earlier architectures. With PCI devices, the software device driver for the board handles
the interrupt configuration. These interrupts are really independent in a way,
because they are not synchronized with any of the other bus signals and so
they can be activated at any time.

PCI Expansion Slots


Earlier in the chapter you saw examples of ISA and EISA slots as well as an
MCA card. Take a look at Figure 5.10.
FIGURE 5.10

PCI Bus connector


As you can see, the finger slots in the bus are very small, and packed very
closely together. These expansion slots are usually white, and they are
divided into two sections.
There are two different kinds of PCI expansion slots and the voltage the
slots use differentiates the versions. One of the types uses +5 volts DC to
power the expansion card, while the lower-voltage model uses 3.3 volts.
When you look at the connector for the buses, the only differences are the
positioning of the blocker in each connector. This blocker, or key, keeps the
3.3-volt card from being plugged into a 5-volt slot.
Now, you have been wondering why I spent all that time covering some
of that other stuff on PCI Bridges and what have you. Well, when we talk
about expansion slots, one of the first questions that comes to mind is how
many can you have. The answer to that is, "It depends." As you start stuffing
more and more stuff in a smaller and smaller space, something has to give,
and it is usually the electrical effects inside any given system. Because PCI
operates at a high bus speed, it is especially susceptible to high frequencies,
radiation, and other forms of electrical interference. The current standards for a local bus limit to three the number of high-speed devices that can be connected to a single bus.
If you paid close attention there, you noticed that the standard calls for just
three devices, not three slots. Most local bus systems now have their video
display built into the motherboard. That circuit counts as a local bus device,
so, if your PC has video on the motherboard, you can use two local bus
expansion slots.
The limit of three devices comes from speed considerations. The bigger the bus, the more connectors there are. More connectors mean that any signal placed on a circuit will degrade more quickly, and the only way to beat the degradation is to start with a stronger signal. Somewhere, someone had to draw the line, and the line was drawn at three devices.
While it seems that three devices may be limiting, it is not. Remember our
discussion of PCI Bridges? Well, since the three-device limit is per expansion
bus, the PCI Bridge allows multiple expansion buses to be linked together.
Each of these will use its own bus control circuitry. While this may sound
complicated, it is one of those things that doesn't really make any difference.
After all, as long as it works, that is all that counts, and the design is all in
the chipset.
Is there a way you can use this technology not only to increase the performance, but also to increase the availability?


PCI Hot Swap


If you remember way back in the book to Chapter 1 when we discussed
RAID technology, I also touched on things like a hot-swappable drive or a
hot-pluggable drive. In case you missed it, here is the premise. It used to be
that servers and networks were fragile and everyone took it for granted that
they went down occasionally and that was a fact of life. That was then, this
is now. Now, the server is an integral part of the operation of any company.
Companies can lose hundreds of thousands of dollars an hour in productivity if their servers are down. The goal, as far as management is concerned, is
to keep the servers up.
Now, you and I are networking professionals. We both know that no matter
how carefully you plan, no matter how carefully you design, no matter how
many precautions you take, life happens and things like components fail. This
used to be expected and it was the norm to schedule an outage, take a server
down for a couple of hours during the slow periods, and then have it back up
and working for Monday morning. More and more shops are working 24 hours
a day, seven days a week, so any downtime needs to be severely limited.
Hot swap, in theory, is simple. Why take the entire server down when all
you have to replace is one device or one card? Why not just replace the card
and be done with it? That is the basis for hot swapping. Basically, there must
be the ability to replace a plug-in board in the system while the power is on
and active. The goal is to have the maintenance people just remove the failed
component and insert a replacement. Any procedure more involved than that invites operator error.
Now, if you are talking about hot swapping a hard drive that is part of a RAID array, that is one set of challenges. Electronically, though, it is uncomplicated. You unplug the drive from the electronics, unplug the connection
from the drive controller, and remove the drive. Then, you insert the new
drive, hook up the controller cable, and apply power. The power surge
would be similar to turning a computer on, and the controller would be notified that the new drive is online. With PCI, however, you are removing a card
from a live slot, and replacing a card in a live slot. I can see there would be
some serious challenges.


When you are talking about hot swapping PCI devices, there are two sets of
specifications, the Hot Swap PCI Specifications and the CompactPCI Hot Swap
Specifications that are managed by the PCI Industrial Computer Manufacturers Group (PICMG). The two standards are very similar and they differ in only
a couple of areas. For example, with the PICMG Specifications, the backplane
that the device plugs into is passive, and all the logic is contained on the
adapter card. This same logic is used to power up the adapter card.

Making it easier for the developers is the fact that the devices are controlled
by software. It is up to the system software to provide the smarts for this whole
process to work.

System Software and Hot Swapping


When you start looking at this kind of awareness, it must be instituted at the operating system level or in an application. Either way, that piece must communicate with the software components running on a card. The card and the application can work together to shut things off and make it possible to replace the PCI card. The PCI Hot Swap Specifications lay out six different classes of
software that are Hot Swap compliant. The minimum requirements allow for
just live insertion and live extraction. This is done with two levels of software.
The bottom layer of the software has three hot swap performance grades that
are applied in either a specific use (proprietary) or a general use category.
Specific use software is implemented when the operating system does not
specifically support PCI Hot Swapping. The application software that is
developed in this category cannot necessarily be moved to other platforms.
General use software, on the other hand, is fully compliant with, and usually integrated into, the operating system. This provides for the widest support of platforms and also gives the programmers of board drivers a specific
set of application programming interfaces (API) to use. When it is put
together it looks like Figure 5.11.


FIGURE 5.11   Hot swap software implementations (the application and the OS exchange device service requests and grants through a hot plug service; OS calls reach the service, which uses primary command interplay with the hot plug system drivers that sit on top of the PCI hardware)

Within the general use and the specific use categories are three levels that
define how the live-insertion capability is carried out. These levels are Basic
Hot Swap, Full Hot Swap, and high availability.
Basic Hot Swap The end user must tell the operating system that a card
is going to be inserted or removed. This is usually done from the system
console.
Full Hot Swap This category adds to the functionality of Basic Hot
Swap. In this case, there is a microswitch added to the card's injector/ejector mechanism. This way, the technician does not have to tell the operating system that the change is about to occur. When the card is installed or
removed, the switch changes the electrical configuration and gives the OS
a warning that the process is about to occur.
High availability This level provides the greatest functionality for reconfiguring software while the system is running. This allows for on-the-fly
reconfiguration of both the hardware device and the software components.
In this case, the operating system itself can sense when a card has failed, and
the OS will bring a previously installed replacement card online to assume
the duties of the failed device.
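From a driver writer's point of view, general use support usually boils down to a pair of callbacks: quiesce the device before extraction and initialize it after insertion. The C sketch below is purely hypothetical; the type and function names are invented for illustration and are not the actual PICMG or any vendor API:

#include <stdio.h>

/* Hypothetical callbacks a hot-swap-aware OS might ask a driver for. */
typedef struct {
    const char *name;
    int  (*attach)(int slot);   /* card inserted: bring device online   */
    void (*detach)(int slot);   /* extraction pending: quiesce, flush   */
} hotswap_driver;

static int nic_attach(int slot)
{
    printf("NIC in slot %d brought online\n", slot);
    return 0;
}

static void nic_detach(int slot)
{
    printf("NIC in slot %d quiesced for removal\n", slot);
}

int main(void)
{
    hotswap_driver nic = { "nic", nic_attach, nic_detach };

    /* Full Hot Swap: the ejector microswitch, not the operator, tells
       the OS that the card is about to leave the slot. */
    nic.detach(3);   /* technician opens the ejector */
    nic.attach(3);   /* replacement card is seated   */
    return 0;
}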

Hot Plug PCI


The goal of Hot Plug PCI is the same as the goal for hot swap: basically, keeping a system that needs to have high availability available. In the case of revision 1 of the PCI Hot Plug Specifications, this is done by giving the chipset on the device a way to deal with various software requests. These requests include things like these:


- The ability to choose to isolate a card from the logic on the system board.
- The ability to choose to remove or apply power to a specific card connector.
- The ability to choose to turn on or turn off an Attention Indicator that is associated with a specific card. This attention indicator can then be used to draw the user's attention to the connector.

With Hot Plug PCI, the user cannot remove or install a PCI card without
first telling the software. Once the software has been notified, it performs the
steps necessary to shut down the card connector so the card can be removed
or installed. It is up to the operating system to visually let the end user know
when it is all right to install or remove the card.
The advantage of Hot Plug is that you can use any PCI card in the system.
Changes are needed to the chipset, the system board, the operating system,
and the drivers.

I2O: Intelligent Input/Output


Intelligent Input/Output (I2O) is designed to provide a dedicated I/O processor on certain devices, like a server motherboard or even on network
cards or disk controllers. The standard is going to provide an I/O architecture that is independent of both the hardware device that is being controlled
and the host operating system.
I2O is made up of three software layers: the OS Services Module (OSM), the
I2O Messaging Layer, and the Hardware Device Module (HDM). The OSM is
designed to manage the communication between the host CPU operating system
and I2O Messaging Layer. This layer is unique for each operating system and
also for each device class. The I2O Messaging Layer is in the middle, handling
communications between the OSM and the HDM. The Messaging Layer frees
the device manufacturer from having to develop different drivers for the same
device. Finally, the HDM passes messages between the peripheral device and the
I2O Messaging Layer. The HDM has to be unique for each device but, if you
have several of the same kinds of devices, only one HDM is required. The HDM
is independent of the host CPU operating system.
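A toy C model makes the layering concrete (every name here is invented for illustration; real I2O messages are far richer than this). The point is only that the OSM knows the operating system, the HDM knows the hardware, and the messaging layer between them keeps the two decoupled:

#include <stdio.h>

typedef struct { int unit; const char *request; } i2o_msg;  /* invented */

/* HDM: hardware-specific, OS-independent. */
static void hdm_dispatch(const i2o_msg *m)
{
    printf("HDM: unit %d performs \"%s\"\n", m->unit, m->request);
}

/* Messaging layer: the only thing the OSM and HDM share. */
static void msg_layer_send(const i2o_msg *m)
{
    hdm_dispatch(m);          /* a real layer would queue and route */
}

/* OSM: OS-specific, hardware-independent. */
static void osm_request(int unit, const char *request)
{
    i2o_msg m = { unit, request };
    msg_layer_send(&m);
}

int main(void)
{
    osm_request(0, "read block 42");
    return 0;
}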

I2O can provide the server with the following benefits:

- Provides an I/O architecture that is both extensible and independent of specific hardware or operating systems.
- Improves the throughput of the server, because the I/O of the peripheral is removed from the CPU.
- Provides an industry standard architecture.
- Can increase fault isolation and recovery, which works to provide for higher availability.
- Provides for direct management of the I/O.

AGP
In each of the previous sections we have been talking about throughput,
especially when it comes to network interface cards and disk controllers. In
each of these sections we stressed how important it was to off-load the mundane tasks from the processor, in effect, giving it more time for the serious
processor tasks.
This section is going to take a somewhat different tack, concentrating on
video. Now, this is not necessarily a topic I normally associate with servers,
because usually servers are servers and high-definition graphics is not all that
important.
AGP is short for Accelerated Graphics Port. It is an interface based on
PCI and designed for the throughput demands of high-demand video like
3-D graphics. Rather than using PCI for graphics data, AGP has a dedicated point-to-point channel to directly access main memory. AGP runs at
66 MHz over a data channel that is 32 bits wide, providing bandwidth of
266 MBytes/second. This compares to the PCI bandwidth of 133 MBytes/
second. In addition, there are two optional, faster video modes, providing
throughputs of 533 MBytes/second and 1.07 GBytes/second. AGP can support this kind of throughput by storing some of the 3-D textures in main
memory rather than in video memory.
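Those numbers are simply bus width times clock rate times transfers per clock, as this quick C check shows (using the nominal 33.3 and 66.6 MHz clocks; real-world throughput is always somewhat lower than the peak):

#include <stdio.h>

/* Peak bandwidth (MBytes/sec) = width in bytes x clock in MHz x transfers per clock */
static double peak(double bytes, double mhz, double per_clock)
{
    return bytes * mhz * per_clock;
}

int main(void)
{
    printf("PCI 32-bit @ 33 MHz: %4.0f MBytes/sec\n", peak(4, 33.3, 1));
    printf("AGP 1x     @ 66 MHz: %4.0f MBytes/sec\n", peak(4, 66.6, 1));
    printf("AGP 2x             : %4.0f MBytes/sec\n", peak(4, 66.6, 2));
    printf("AGP 4x             : %4.0f MBytes/sec\n", peak(4, 66.6, 4));
    return 0;
}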
Why would you install AGP on a server? Well, if there are 3-D applications that you have to run on the server, AGP will help off-load some of the
work that is placed on the CPU. If there is a 3-D application running, the
CPU (without an AGP graphics controller) is responsible for performing all
those intensive 3-D calculations. The graphics controller can process the texture data and the bitmaps. At this point, the controller has to read information from several different textures and then average the information into a
single pixel on the screen. While this calculation is being performed, the pixel
is stored in the memory buffer. Since the textures are very large, there isn't room in the video card's buffer. AGP overcomes this shortcoming by storing
the image in main system memory.
When AGP wants to access the texture data, it uses a process called Direct
Memory Execute (DIME). DIME connects the system's main memory to the
AGP/PCI chipset.
So, should you be looking for an AGP controller in your server? If your
server is going to be physically running some 3-D applications, it may be
something you want to look at. However, you should know that several published studies question whether there really is a performance increase over
using just a PCI video card. If your system is going to be making use of AGP,
you should definitely add more memory to the server to provide the extra
memory that the video subsystem needs.

Real World Scenario


This is another of those sections where the information provided will help you
understand the makeup of your current server. When you start talking about the
different methods of PCI Bridging, chances are this is not something you are
going to go out and add to a current server. Your high availability, high utilization
servers will have these things configured at the factory.
PCI really went a long way toward removing the mystery of component
installation. Before PCI came along, installing a new component or reconfiguring the system bus was not for the faint-hearted. You needed to have
lots of up-to-date information, and even then, things could get somewhat
confusing if the information was not exactly right. At that point, you had
to do some troubleshooting by trial and error.
Early bus architecture also limited the expandability of systems. Since IRQs
could not be shared and the number of usable IRQs was limited, you were
extremely limited in the number of additional components you could add to a
server. In some cases, you ran out of IRQs before you ran out of slots!


The System Bus and System Performance

All the network administrators that I know really hate it when they hear the complaint, "Gosh, the network is slow today!"
In this chapter, I have laid out several different technologies that can help
you provide both performance and high-availability solutions. As you saw at
the beginning, the early motherboards bus technology was speed limiting,
not only in processing power but also in moving the information from the
processor back to the user who requested it. PCI helped to change that.
Each of the technologies that we have talked about has stressed the same
philosophy: Take the mundane calculations away from the processor and let
something else handle it. That way, the processor is freed up to do other
things. This, in turn, speeds up performance.
When you design your server, pay close attention to the types of subsystems
that are present, and be sure to take full advantage of them. Also understand
that for each of the performance enhancing technologies that you opt to have
in your server, there are going to be trade-offs. Usually that trade-off will come
in the form of how large a check you will have to write to pay for the server.

Upgrading Peripheral Devices

A system peripheral is an add-on component that performs a service not


typically associated with the main server computer. For example, a tape
backup device is considered a peripheral, as is a CD player, a CD writer, a
scanner, a multi-port serial device, and so on. Managing peripherals becomes a big part of a server admin's job when it's time to upgrade, but after that, provided the peripheral is behaving OK, the device is basically a set-it-and-forget-it thing.
You'll be surprised how often you entertain the notion of upgrading peripherals. For example, in the Digital Linear Tape (DLT) backup device category, you might've inherited a server that was able to put 40 MBytes/minute of compressed data onto a DLT tape, but today you want to replace it with a tape drive that's capable of 80 MBytes/minute. So a peripheral upgrade is in order. Or maybe you want to upgrade that old 4x CD-ROM with a nice new 48x device.
There are some very rare occurrences in which you can upgrade a card in
a peripheral rather than having to upgrade the entire peripheral, but those
times are few and far between.


Realize too that you can often get a few extra miles out of an older peripheral by simply upgrading its system BIOS. This may not be possible with all
devices, but many of them have the ability to have their firmware updated to
make them compatible with newer operating systems.

Verifying Appropriate System Resources


When considering upgrading a peripheral, you must take certain key physical
elements into consideration. We'll use this section to deal with some items that
you must think about (and document on a checklist) before purchasing and
installing a peripheral upgrade.
Expansion slots The first thing to verify is what type of slot the new
device will occupy (ISA, EISA, PCI, etc.) and that you have an open slot
of the appropriate type to receive the new device. This determination will
be meaningful to you when you're considering upgrading slot-based devices such as RAID array controller cards, SCSI adapters, and things like that.
Some devices require a slot in the computer's frame as well as a cable to a card mounted in the motherboard's chassis. A CD-ROM has this kind of configuration requirement. If you don't already have a card in the computer that can accept the CD-ROM's cabling, then you'll have to plan accordingly.
Having enough slots in the computer's frame is also an issue when considering peripheral upgrades. Some computers are filled to the brim with peripheral devices and can accept no more unless, of course, you take out an old peripheral and replace it with the new, which is typically what you'll do.
Some servers have quick disconnecting slots that allow you to easily Plug
and Play peripherals in and out of the system. The server chassis designers
are beginning to create very smart admin-oriented designs that allow you
to get a peripheral upgraded in a hurry.
Interrupts One of the bigger problems you'll run into is having enough interrupts (IRQs) to handle the replacement. Again, if you're simply swapping out, you may not have an issue. But what if you're swapping out an ISA-based backup device with a SCSI-based unit but you find that you don't have any IRQs available for the new device? Well, then you've got to look at getting rid of things you don't need so that you can free up an IRQ. For example, COM2 utilizes an IRQ that you may not need. Go into the
server's BIOS, disable the COM2 port, and you've freed up an IRQ that you
can use for your device. In situations where you're not sure about the IRQs, figure out what IRQs are in use first; then you'll know what's available for the new device.
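How you take that inventory depends on the operating system: on a Windows NT server you can use the NT Diagnostics utility, and on a Linux server the kernel publishes the table in /proc/interrupts, which even a few lines of C can dump (this sketch is Linux-specific):

#include <stdio.h>

/* Print the kernel's view of which IRQs are in use (Linux only). */
int main(void)
{
    FILE *f = fopen("/proc/interrupts", "r");
    int c;

    if (f == NULL) {
        perror("/proc/interrupts");
        return 1;
    }
    while ((c = fgetc(f)) != EOF)
        putchar(c);
    fclose(f);
    return 0;
}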
Direct Memory Access Direct Memory Access (DMA) provides a way that data can be transferred from a device to system memory or vice versa without having to go through the CPU, thus freeing up CPU cycles. You set up a DMA channel for the data to go through. DMA isn't heavily used, but it should be used more than it is. When purchasing new peripheral gear, check the product's documentation to see if it can use DMA, then decide which DMA channel you'd like to set up for the device.
Cabling Cabling is a huge issue for external SCSI devices. You'll have to look at the back of the computer to determine what type of connection the internal SCSI adapter's external port has. Next you determine what kind of SCSI connection the new device is expecting. Finally you purchase a cable that matches the configuration. For example, suppose that you're going to purchase an Ultra-SCSI device but plug it into a SCSI II external port on the computer's SCSI adapter. You'll need a SCSI II-to-Ultra-SCSI cable. You'll want to make sure which side is male and which is female as well, before you go looking for the cable. You can buy adapters that fit onto a SCSI cable to make the cable work with different SCSI versions. I think you're setting yourself up for data transfer problems if you purchase an adapter because it could work loose and cause you some problems that may be hard to diagnose.
Note that you might want to go into the SCSI adapter's BIOS to tweak it so it works with the new device. Check your SCSI adapter's documentation for more information on adjusting BIOS settings.
Power Some peripherals require a power socket and are separately powered from the computer. Be aware of this before you buy so that if you're lacking enough power sockets where you want to place this peripheral, you can get the electrical work done before the peripheral comes in. For example, some backup tape devices require a substantial power supply, and you'll have to address the power needs before the gear can be put into production.
When your new gear comes in, read its documentation thoroughly and be sure you understand how to install and configure the device. Lots of times it's easy to get in a hurry and think that you don't need to bother with reading the documentation, but it's always worth your while to be sure you read and understand how the device is supposed to interplay with your system.
Performing an Upgrade Checklist


Be sure to establish some sort of upgrade checklist that you can use to confirm
that you've completed every procedure needed for a successful implementation
of the new device. Your checklist should cover the resource characteristics
discussed above, as well as downloading updated drivers, checking the FAQs
and readme files for current information, and other pertinent items. A simple
script like the sketch below can help keep the checklist consistent from one
upgrade to the next.
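Here's a minimal sketch of such a script; the checklist items below are examples only, not an exhaustive or authoritative list, so substitute your own procedures:

# A hypothetical upgrade-checklist runner. Edit CHECKLIST to match
# your own environment and procedures.

CHECKLIST = [
    "Free chassis/motherboard slot confirmed",
    "IRQ available (or freed) for the new device",
    "DMA channel chosen, if the device supports DMA",
    "Correct cable and connectors on hand",
    "Adequate power sockets at the installation site",
    "Updated drivers downloaded",
    "FAQs and readme files reviewed",
]

def run_checklist(items):
    """Prompt for each item; return True only if every item passes."""
    passed = True
    for item in items:
        answer = input("%s? [y/n] " % item).strip().lower()
        if not answer.startswith("y"):
            passed = False
    return passed

if __name__ == "__main__":
    if run_checklist(CHECKLIST):
        print("All checks passed -- proceed with the upgrade.")
    else:
        print("One or more checks failed -- resolve them first.")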

Summary

In this chapter we completed the hardware portion of the book. We
have looked at all the various hardware subsystems, from the disk drives,
to the processor, down to the bus. Here we looked at ways to increase the
performance and availability of the PCI subsystem, which you will find in
virtually all of today's servers.
We also talked about upgrading peripheral devices and some of the
things that you have to be cautious about during an upgrade. You must
make sure you have ample chassis and motherboard slots to house the new
gear. You'll also want to be cognizant of IRQ and DMA settings, and to make
sure you have enough power and the right kind of cabling for the new device.
In the next few chapters we are going to look at how the server can
actually be used. We will begin by looking at some management protocols and
then move into various server roles. So, now that you know how to build
your server, it is time to figure out how to use it!

Exam Essentials
Know the basics of PCI bus mastering PCI Bus Mastering is a way to
improve performance by letting a device take control of the bus and transfer
data directly to other components. This is one way of making sure the CPU is
involved in only those transactions that it really has to act on. If the
workload of the CPU is eased, your server should experience better performance.
Know the basics of PCI hot swap or PCI hot plug PCI Hot Swap
means that you can remove a bad component and replace it without shutting off the server. PCI Hot Plug means that you can add a component
without taking the server out of service.
Know the basics of a hierarchical and peer PCI bus With a hierarchical
PCI Bus, the buses in the hierarchy operate concurrently. That means
a PCI Master and a PCI target on the same PCI Bus can communicate even
while another PCI Bus is busy. With a peer PCI Bus there are two independent buses, so two Bus Masters can be transferring
data at the same time, giving more overall throughput and higher bandwidth. This is especially useful if you have a server with two or more
peripherals that are bandwidth intensive: if you split the peripherals
between the two buses, you are in effect load balancing, as in the sketch below.
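To make that concrete, here's a small illustrative sketch; the device names and bandwidth figures are assumptions for the example, not measurements:

# Illustrative only: spread estimated device bandwidth across two
# peer PCI buses, hungriest devices first, each onto the lighter bus.

devices = {                # hypothetical MB/s each device might demand
    "NIC 1": 12.5,         # roughly a saturated 100Mbps Ethernet card
    "NIC 2": 12.5,
    "RAID controller": 40.0,
}

buses = {"bus 0": [], "bus 1": []}
load = {"bus 0": 0.0, "bus 1": 0.0}

for name, mbps in sorted(devices.items(), key=lambda d: -d[1]):
    target = min(load, key=load.get)   # pick the less-loaded bus
    buses[target].append(name)
    load[target] += mbps

for bus in sorted(buses):
    print("%s: %s (about %.1f MB/s)" % (bus, ", ".join(buses[bus]), load[bus]))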
Know what interrupts are and how the system uses them Interrupts
(IRQs) are the way components get the attention of the CPU.
Know that EISA is a form of system bus; know how the architecture of the
system bus can affect server performance The architecture of the system
bus determines how much information can flow to the various components
at any given time. The faster the bus (given components that can take
advantage of it), the better the performance should be.
Be able to upgrade a variety of devices Know and understand the
complexities and nuances of installing upgraded peripheral devices.

Key Terms
Before you take the exam, be certain you are familiar with the following terms:
Accelerated Graphics Port
address bus
Basic Hot Swap
blocker
bus
bus arbitration
bus controllers
Bus Master
Bus Mastering
bus slave
clock signals
data bus
direct memory access (DMA)
Direct Memory Execute (DIME)
EISA Configuration
expansion slot
Extended Industry Standard Architecture (EISA)
external bus
Full Hot Swap
Grant (GNT#)
Hardware Device Module (HDM)
Hierarchical PCI Bus
Host-to-PCI Bridge
Hot Plug PCI
Hot Swap
I2O Messaging Layer
Industry Standard Architecture (ISA)
Input/Output (I/O)
Intelligent Input/Output (I2O)
interrupt request (IRQ)
interrupts
load balancing
Micro Channel Architecture (MCA)
OS Services Module (OSM)
PCI Bridge
PCI Hot Plug
PCI-to-PCI Bridge
Peer PCI Bus
Peripheral Component Interconnect (PCI)
Personal System/2 (PS/2)
Request (REQ#)
Review Questions
1. What was the first computer bus referred to as?
A. EISA
B. MCA
C. ISA
D. I2O
2. EISA was referred to as which of the following?
A. An 8-bit bus
B. A 12-bit bus
C. A 16-bit bus
D. A 24-bit bus
E. A 32-bit bus
3. If you want to do PCI load balancing, what will you need to have?
A. A Peer Bus
B. A Hierarchical Bus
C. Hot Swap devices
D. Hot Plug devices
4. Which version of Hot Swap PCI requires users to notify the operating system that they are about to take a device out of the system?
A. Basic Hot Swap
B. Full Hot Swap
C. High availability
D. All of the above
5. How many devices can be part of the original PCI Bus?
A. Five
B. Four
C. Three
D. Two
E. One
6. If you are using PCI load balancing and you want to install two network interface cards and a single drive array controller, how would you plan to install them in a Dual Peer Bus configuration?
A. Put both of the NICs on one bus and put the drive array controller on the other.
B. Put both NICs on the master PCI Bus and the drive array controller on the slave PCI Bus.
C. Put a NIC on each of the peers, and add the drive array controller to either one.
D. There cannot be more than one NIC in any server.
7. What is Bus Mastering?
A. All transactions are sent directly to the processor.
B. All transactions are sent directly to memory.
C. All transactions directed to the disk array controller are directed to the PCI controller.
D. The microprocessor does not have to be involved in every transaction.
8. AGP is a type of which of the following technologies?
A. Disk Array
B. Video
C. PCI Hot Swap
D. PCI Hot Plug
9. AGP makes use of what type of memory?
A. L1 cache
B. L2 cache
C. Main memory
D. High memory
10. With a PCI Bus, how are interrupts configured?
A. Software
B. Jumper
C. DIP switch
D. Interrupts are hard coded to the expansion slot.
11. What is I2O designed to do?
A. Make use of main memory
B. Make use of L2 cache
C. Provide a dedicated I/O processor
D. Provide a way of accessing load balanced PCI devices
12. Johann is replacing his old 8mm tape drive with a new five-cartridge DLT tape changer. Both devices are SCSI. Now the computer won't boot to the NOS, and Johann is getting a SCSI IRQ conflict error even though he verified that he's using the same IRQ as the old backup device. What could be the problem?
A. New device isn't terminated.
B. Device is trying to use six IRQs.
C. PCI bus is autodetecting the wrong IRQ.
D. New device's BIOS hasn't been enabled.
13. What is an advantage of Hot Plug over Hot Swap?
A. Hot Plug can use regular PCI cards.
B. Hot Plug PCI cards are cheaper than Hot Swap cards.
C. There are no differences.
D. A, B
14. Ahmed has a server to which he's adding a second PCI NIC in order to dual-home it. The card went in OK and didn't seem to complain, but now the network connectivity is acting erratic. What could be the problem?
A. I/O address conflict
B. IRQ conflict
C. DMA address conflict
D. PCI bus disallowing NICs IRQ
15. What are the two types of PCI Bridges?
A. PCI-to-CPU
B. PCI-to-PCI Bridge
C. Host-to-PCI Bridge
D. PCI-to-Host Bridge
16. If PCI Bridges are connected to other PCI Bridges, what is the maximum number of PCI Buses?
A. 512
B. 256
C. 128
D. 64
E. 32