Visual Basic 6 Client-Server Programming Gold Book
(Publisher: The Coriolis Group) Author(s): Michael MacDonald and Kurt Cagle ISBN: 1576102823 Publication Date: 10/01/98
Introduction
What's On The CD-ROM
About the Authors

Part I: Basics Of Client/Server Development With Visual Basic 6
Chapter 1: An Introduction To Client/Server And Networks
    A History Of Files
    What The Heck Is Client/Server?
    Designing The Client/Server System
    Network Topologies And Architectures
    Network Protocols
    Network Operating Systems
    Where To Go From Here

    Database Design
    Data Definition Language
    Data Control Language
    Where To Go From Here

Part II: Visual Basic 6 Database Programming
Chapter 5: Data Access Objects (DAO)
    DAO Object Models
    DAO Objects
    The Data Control
    Where To Go From Here

Part III: Visual Basic 6 And The Internet
Chapter 12: The ABCs Of XML
    Regaining Context With XML
    Leveraging The MS XML API
    Describing Your Data: The DTD
    Building An XML Application
    Where To Go From Here

Part IV: Appendixes
Appendix A: Creating The Sample Database
Appendix B: Differences Between Jet SQL And ANSI SQL
Appendix C: ODBC Functions

Index
Introduction

I have a confession to make. As recently as a year ago, Visual Basic was not my preferred tool for developing client/server applications. I have been developing database applications for two decades and have used many languages and tools. While I will never claim to know it all, I have a good sense for what works in the real world and what doesn't. As far as client/server development is concerned, Visual Basic 5 seemed to me to be right on the edge of being a viable tool for large applications. Visual Basic 6 has changed my attitude.

The client/server developer needs a development tool, not just a language. The tool needs to be robust, needs to support a myriad of different backends (data sources), and needs to be able to produce a variety of different types of applications, from single-user desktop solutions to multi-tiered applications deployed on the Internet. Visual Basic 6 is a superb tool adaptable to those types of projects and more. VB6 is not best of breed in all categories. Weaknesses remain in areas such as object orientation and data modeling. The tools that are there sometimes lack a little in how well they are integrated. Even so, VB6 is today's best choice for rapid application development of data-driven applications in a client/server environment.

In this book, my co-author Kurt Cagle and I have gleaned what works and what doesn't from months of work with VB6 betas using a variety of backends and application platforms. Between us, we have almost four decades of real-world experience. We have put together a book that is neither a rehash of the Help files nor a pie-in-the-sky Microsoft marketing brochure. We have assembled a guide to the development of client/server applications that will scale as you need and that will comfortably accommodate the rapid changes in
technology. We have kept our eye on the target audience, the experienced Visual Basic developer, while acknowledging that "experienced" means many different things in many different environments. We do not insult the reader's intelligence with blow-by-blow details of using the Application Wizard, for instance. But we also do not assume that the reader has previously built large-scale, multi-tiered client/server systems. In fact, very few VB developers have built such applications. So, we lay the groundwork as we incrementally build the core knowledge from chapter to chapter.

Chapters 1 through 3 concentrate on the environment within which the Visual Basic client/server application will be running. We discuss the whys and wherefores of client/server in terms of the network and the database. We discuss and summarize network issues and database platform issues. We show you how to intelligently design a database and how to use SQL.

In Chapter 4, we cover the gamut of Visual Basic 6 database access techniques, including DAO, RDO, ODBC, VBSQL, and ADO. We offer specific advice on when to use what, as well as our opinions on the relative merits (and demerits) of each approach. Chapters 5 and 6 cover development using DAO and RDO, respectively. While most developers will eventually move to ADO, both DAO and RDO will be around for some time to come.

Chapter 7 introduces ADO and OLE DB. We show you how to use these new tools, taking the time to compare and contrast them with the more familiar DAO and RDO models. In Chapter 8, we guide you through the conversion of existing DAO and RDO projects to ADO. We use the example projects built in Chapters 5 and 6 as scenarios and provide detailed step-by-step guidance. Again, we provide this guidance in a real-world manner, resorting to brute-force techniques when such techniques work best.

Chapter 9 gets into more advanced and efficient techniques using ADO and OLE DB. We show you some of the newer tools with detailed examples. We discuss the DataEnvironment object, the DataRepeater object, the Format object, and so on. We make no bones about where things are a little rough around the edges, showing you how to get around the flaws in hierarchical data presentations and similar gotchas.

Chapter 10 is where we start hammering away at solid object-oriented development techniques instead of just paying lip service to them. We detail the advantages and disadvantages of different approaches to object-oriented development using classes. We discuss the efficiencies and inefficiencies of different binding techniques. From there, we develop business objects from classes that act as data providers to different objects, optimizing each for the circumstances under which it is deployed. We show you critical gotchas when moving a local business object to a remote platform and, frankly, show you how to plain make it work.

Chapter 11 takes advantage of the knowledge assembled in the first ten chapters to explore advanced database techniques, including the creation and use of stored procedures and triggers. We deliberately placed this chapter at this point in the book both because these techniques help to solidify the traditional client/server application and because the techniques are absolutely
critical to the success of Web-based applications, which we discuss in the remainder of the book. Because so much is said about stored procedures and triggers, and because so few shops actually make effective use of them, we explore unusual uses of them, both in terms of data validation and in the creation of a self-referencing data dictionary.

And that brings us to the last third of the book: VB6 and the Internet. While client/server applications are, by definition, network applications, the Internet is the mother of all networks. The Internet connects 79 million people in the United States alone. With servers in excess of 10 million and Web pages in excess of 300 million, the Internet only continues to accelerate its growth. The astonishing thing is that each of those 79 million-plus people is connected to all the others and to many millions more around the globe, each one redefining what we mean by client/server programming. As cable modems and DSL lines begin to replace 28.8Kbps and 56Kbps analog modems, the number of people on the Internet, and the uses for it, will jump astronomically, as will the expectations of users.

Client/server programming is all about getting data to and from the user. Visual Basic 6 dramatically redefines the boundaries between client and server, between data and user, offering a combination of one of the best RAD development environments on the planet with all the power of Active Server Pages and Dynamic HTML. In essence, with Visual Basic 6 you can create sophisticated Web applications targeted to the widest possible audience using the same tools that you have already mastered for other client/server development. The final four chapters of this book focus on this new technology, with an in-depth look at Internet Information Services applications and Dynamic HTML applications. Additionally, one of the hottest topics in data communications, Extensible Markup Language (XML), is explored in detail, showing how you can take advantage of this new data standard in your own programs. Finally, this book looks at several useful technologies for the Internet-savvy database engineer, including remote component servers, client- and server-side scriptlets, data persistence, and more.

There is no right way or wrong way to use this book. (Well, using it as a doorstop or as something to raise your monitor probably isn't the best use we could imagine.) The experienced client/server developer may skim or skip the first three chapters. Readers new to client/server may wish to spend extra time on those chapters. Readers making a commitment to ADO should at least read Chapters 7 and 9 before skipping ahead to the fun stuff (Web development). Those looking to check out the viability of Web-based development may want to skip ahead to Chapter 12 before coming back to other portions of the book. At the front of each chapter, we have included keywords highlighting the topics to be covered, and at the end of each chapter, some suggestions for where you may want to go next.

Also, be sure to check out the code samples on the enclosed CD-ROM. Several
people checked each piece of code. Although we have striven to make sure that every application works properly, the nature of the material is such that it was not possible to test on every combination of platforms. For example, you may need to make minor alterations in SQL syntax depending on what database you are connected to.

Please pay particular attention to the fact that you will need to do some setup on your own system. Many of the examples use an ODBC data source; after creating the database (we have included the SQL statements to do so on the CD-ROM), you will need to set up the ODBC data source in the Control Panel. The paths to the data in some of the examples may need to be changed to reflect your paths. Also note that some samples connect to server applications. Since these server applications need to be registered (see Chapter 10 in particular), you will need to compile them, go to the References dialog box, and reselect the references. Otherwise, you will get mysterious error messages.

For the Internet chapters, please note that many of the examples were done using Internet Explorer version 5. You may wish to download that from the Microsoft Web site. We would have included it on the CD-ROM, but it was in beta as this book was going to press, and a more current release will surely be posted before you read this.

You can use Access for most of the examples in this book if you want to experiment at home or otherwise do not have access to a relational database server. You can also download trial versions of a number of good single-user database engines. Sybase SQL Anywhere (renamed Sybase Adaptive Server Anywhere as this book was going to press) is an easy-to-set-up-and-use product, available from www.sybase.com/products/anywhere/index.html as a free 60-day trial.

If you like the book, please buy a couple dozen more. You can give them to your kids, your spouse, your mother (she will be impressed and thankful), and your boss (he or she will also be impressed and will give you a raise). Coriolis always invites comments, suggestions, criticisms, and so on. Visit their Web page at www.coriolis.com to find contact information or to check out any updates to the book.

Thank you,
Michael MacDonald ([email protected])
Kurt Cagle ([email protected])
What's On The CD-ROM

…and sell through your Web site. (Trial version)

Suprasoft Crystal Reports Crystal Design Component: A custom component that encapsulates the previewing of Crystal Reports reports inside an easy-to-use control. This is an efficient and productive building block for creating a unique and customized preview application or front end for your software. (Limited version)
Requirements:
Minimum of 486 or equivalent processor
16MB RAM
Windows 95/98/NT
Visual Basic 6
Acknowledgements
This is the area where I always get to thank everyone who so much as returned a phone call. But there were some very special people behind this book. I am especially indebted and grateful to the publisher itself, Coriolis Group Books, for allowing the extra time to produce a book that was solid instead of a book that was early to market. It says a lot about the integrity of an organization that it puts quality ahead of profits.

I want to thank the Acquisitions Editor for this project, Stephanie Wall, for giving me the opportunity to do this book and for a lot of cheeriness and professionalism. Toni Zuccarini was the project editor, a job that does not get nearly enough thanks or recognition. Toni coordinated a myriad of details with grace and aplomb. John Lueders was the technical editor and painfully (to me as well as to him, I am sure) reviewed every line of code in the book and on the CD-ROM, offering advice and criticism, and did it well. Kristine Simmons was the copy editor for the project. Copy editors are those people who review manuscripts for clarity and grammatical correctness. Kris offered many fine suggestions and untangled sentences and thoughts far better than I ever could have. I have worked with many copy editors, and (I mean this in all sincerity) Kris is probably the best.

I have to acknowledge the contributions of my co-author, Kurt Cagle, who stepped up to bat to handle Chapters 12 through 15 and really did a fantastic job.

There are many people who go into the creation of a book who work behind the scenes without much in the way of recognition: Alan Pratt is the Technology Director and keeps, well, technical things moving between Coriolis and authors; Robert Clarfield coordinates a blizzard of details in getting that CD-ROM into the back of this book; Wendy Littley is the Production Coordinator and is in charge of getting the text laid out, getting proofs, coordinating among a zillion people, and so on; Tony Stock designed the cover of this book (and you thought I did that myself!) invisibly but professionally; and there are many others. To all of them: thanks. I need a vacation!
Dedication
As always, to my family (my wife, Patricia, and my children, Amanda and Peter), who are the ones who truly make the sacrifices when I work past midnight. Also, to some very special colleagues whom it has been my privilege to work with for more than a decade, including Larry Altrich, Steve Remmes, and Randy Vance. And finally, to a person who took a neophyte under his wing almost 20 years ago, Dr. Thomas Gross.
Michael MacDonald

To my parents, who taught me that humor and compassion will defeat arrogance and stupidity every time.
Kurt Cagle
Part I: Basics Of Client/Server Development With Visual Basic 6

Chapter 1: An Introduction To Client/Server And Networks
Key Topics:

How we arrived at the relational data model
Distributed computing
The definition of client/server
2-tiered, 3-tiered, and n-tiered architectures
The Microsoft Services model
COM/DCOM and the Component model
Application partitioning
Network topologies and architectures
In this chapter, I introduce the underpinnings of client/server and how it evolved from the distributed data model of the 1970s and 1980s to the multi-tiered model of today. For some readers, this material is old hat; skim it or move on to the next chapter. For others, however, the material provides real-world background against which to set the remainder of the book in context. The material is as applicable to development with any other language as it is to development with Visual Basic (VB).

This chapter is far-ranging in the topics it covers. I discuss the evolution of file formats: how they led to the relational model of today and the object model of tomorrow, and how files evolved to support the distributed data format that was a direct ancestor of today's client/server. I define basic and advanced client/server architectures, and I conclude with a review of the network topologies and architectures upon which client/server is based.
A History Of Files
Since the inception of the computer, there have really been only three file types: sequential, navigational, and relational. We are now seeing the introduction of a fourth type, the object model, along with relational/object hybrids. Concomitant with this evolution has been a shift of responsibility from the application to the database. For instance, in the sequential model, the application was responsible for the location of any given record as well as for the enforcement of all business rules. In the following sections, I review the evolution of files; in other sections of the chapter, I point out how this evolution mirrors the move to client/server development.
Sequential Model
Sequential files consist of fixed-length records, each of which typically contains fixed-length fields. The 80-column cards of old were essentially sequential files, as were (and are) files stored on magnetic tape. A variation on the sequential model, the text file, evolved when computer languages reached the point where they could use a character (such as a carriage return) as a record delimiter instead of depending on the end of a record being in a fixed location. Files such as AUTOEXEC.BAT and CONFIG.SYS are text files. Visual Basic supports both forms of sequential files, using the Open keyword to open the file, Close to close the file, and a variety of keywords to read from and write to the file depending on the nature of the file's use. I review these in Chapter 4.

The key advantages of the sequential file model are simplicity of programming file access and the speed with which an entire file can be processed. On the other hand, it is not possible with sequential files to locate any one record without first reading all the records before it. For instance, to pull up John Smith's customer record, the program must read the first record in the file and determine whether it is the correct one. If it is not, the program must read the next record, and so on, until the desired customer is found. As a result, the sequential file model does not support any type of ad hoc reporting capability. The application program is responsible for all file I/O, for all record searches and updates, and for all data integrity measures (such as ensuring that an order has a valid customer).

Some improvements were realized by using random access techniques. In theory, if your application knew that John Smith's record was the 318th record in the file, the program could move directly to that record without reading the first 317 records. The caveat, of course, is that the application was then responsible for creating and maintaining some sort of index into the file, not an easy piece of logic to code. Programming languages had to evolve to support this, as Visual Basic does when files are opened in Random mode. Random access requires that every record be fixed length. A variation is Binary mode, where, instead of accessing a record by specifying its record number, the program uses a byte offset from the beginning of the file.
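To make this concrete, here is a minimal sketch of sequential access in Visual Basic. The file name and field layout are hypothetical; Open, Input #, EOF, and Close are the built-in VB file I/O keywords mentioned above:

    ' Scan a comma-delimited text file (sequential model) record by record.
    Dim sName As String, sState As String
    Open "C:\Data\CUSTOMER.TXT" For Input As #1
    Do While Not EOF(1)
        Input #1, sName, sState    ' Read one record's fields
        If sState = "CA" Then Debug.Print sName
    Loop
    Close #1

Note that even this simple filter must read every record in the file; the search logic lives entirely in the application, which is precisely the weakness of the model.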
Navigational Model
The navigational file model, sometimes called the hierarchical model, was a major improvement. The term stems from the fact that it was usually necessary to navigate to a desired record by first accessing other records. Often, the details of this navigation were hidden from the developer so that the database did the dirty work of locating a specific record. Common examples were dBase and Btrieve files. On the mainframe, the navigational model was implemented with the Virtual Storage Access Method (VSAM). Visual Basic supports the navigational model in two ways. Using a file opened in Random mode, you can manually maintain some sort of hashing algorithm to navigate from record to record. Often, this method is implemented in a manner similar to what C programmers know as linked lists. A linked list essentially maintains a series of pointers into a structure or file, allowing navigation by (for example) account number or customer name. Visual Basic also allows connection to navigational databases via the Microsoft Jet engine. Although the interface provided is SQL, the underlying file is still accessed navigationally. Microsoft refers to this as the Indexed Sequential Access Method (ISAM).
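Here is a minimal sketch of the do-it-yourself navigational approach in Visual Basic: a fixed-length record opened in Random mode, with a hypothetical NextRec field serving as the linked-list pointer just described. The record layout and file name are illustrative only:

    ' In the module's Declarations section: a fixed-length record for Random mode.
    Private Type CustomerRec
        CustName As String * 30
        State    As String * 2
        NextRec  As Long           ' Record number of the next record; 0 = end of chain
    End Type

    Sub WalkChain()
        Dim rec As CustomerRec, lRecNo As Long
        Open "C:\Data\CUSTOMER.DAT" For Random As #1 Len = Len(rec)
        lRecNo = 318               ' Jump directly to record 318; no scan of records 1-317
        Do While lRecNo > 0
            Get #1, lRecNo, rec    ' Direct positioning by record number
            Debug.Print rec.CustName
            lRecNo = rec.NextRec   ' Navigate via the stored pointer
        Loop
        Close #1
    End Sub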
Relational Model
The relational model organizes data into related tables. A table is a two-dimensional grid of columns and rows where each row is a record and each column is a field in that record. Each table has a unique identifier, called a primary key, which allows the retrieval of any one row without referring to any other row. An example of a primary key might be customer number. Tables are related by common information. An Orders table would be related to a Customer table via a common customer number column: Each table has a column containing customer number. The relationship is enforced via a foreign key. The foreign key is a special type of index on the database that
enforces a business rule such as "no order may exist without a valid customer number." This enforcement is called referential integrity. The database typically sports other methods of maintaining data integrity, such as restricting the value of a gender column to "male" or "female." These enforced rules are called constraints.

There are a number of vendors of relational database management systems (RDBMSs), including Oracle, Informix, Sybase, Microsoft, and IBM. All of the RDBMS engines use SQL as the language for accessing and manipulating data. The majority of this book is devoted to using Visual Basic in an environment with an RDBMS such as Oracle, including techniques for maximizing performance of the database as well as maximizing developer productivity. The next two chapters introduce RDBMSs and the use of SQL.
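The following sketch shows how these concepts look in data definition language (DDL). The tables and columns are hypothetical, and the syntax is ANSI-style; a particular RDBMS may require minor adjustments (see Appendix B for Jet versus ANSI differences):

    CREATE TABLE Customer (
        CustNumber INTEGER     NOT NULL PRIMARY KEY,  -- primary key: unique row identifier
        CustName   VARCHAR(30),
        Gender     CHAR(6)     CHECK (Gender IN ('male', 'female'))  -- a constraint
    );

    CREATE TABLE Orders (
        OrderNumber INTEGER    NOT NULL PRIMARY KEY,
        CustNumber  INTEGER    NOT NULL,
        -- Foreign key: referential integrity ("no order without a valid customer")
        FOREIGN KEY (CustNumber) REFERENCES Customer (CustNumber)
    );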
Distributed Computing
To understand the move toward client/server, it is necessary to digress briefly to the problems inherent in the centralized processing model of the mainframes of the 1960s and 1970s. All data, along with the application programs written to manipulate that data, resided within the confines of a centralized computer. For a company with a single office in Boston or Chicago, this arrangement was fine. However, for a company with offices around the country or around the globe, it created practical problems: If the home office (and the computer system) was in Chicago, office workers in Boston and Los Angeles had no practical access to data relevant to their operations. The concept of the Distributed Computing Environment (DCE) sought to mitigate this problem by moving data closer to users on multiple machines. This solution had the added benefit of spreading processing loads across multiple machines. On the other hand, it was difficult to update information that was spread over multiple computers. It was also difficult to merge the data back together for reporting.

As networked computers evolved, so did the proliferation of DCE. Networks were (and are) a natural platform on which to distribute data. Contrary to popular belief, networks have been around almost as long as computers. In 1964, the United States government contracted with the Rand Corporation to design a network that could continue communicating even if a portion of the network was destroyed in a disaster. The result, ARPANET, came online in 1969 and eventually evolved into what is known today as the Internet. Networks stretched the definition of DCE by adding the concept of sharing resources, such as printers and other expensive hardware. As the expense of networks decreased and their capabilities increased throughout the late 1980s, the network exploded into prominence in corporate America.

With the increased use of networks, sharing data on file servers was a natural happenstance. This development, however, revealed the weakness of networks: The bottleneck on nearly any network was and is the low bandwidth, the amount of data that can be moved over the cabling. As an example, a high-speed Ethernet network can theoretically move data at 100Mbps (megabits per second). In reality, the wire (the network cabling) is shared by many users, so even though the theoretical throughput sounds high, effective throughput dwindles as more users are added to the network. Worse, the wire can only carry one piece of data at a time. Therefore, as more users are added, the potential for two users attempting to send data at the same time increases. When this happens, a data collision results, and the data has to be retransmitted. Thus, as network utilization increases, performance degrades very rapidly. Client/server solutions take these performance issues into account by minimizing the amount of data being moved.
Figure 1.1 In a two-tiered client/server application, the client requests records from the database server, which processes the request and returns only those records that meet the criterion.

Contrast this setup with just a shared file on a server. (I discuss this concept further in Chapter 2, where I define the term RDBMS engine. With a shared file, there is no separate process, no program, running on the server; the file, perhaps an Access database, is merely shared by multiple users.) Let's assume that we have 10,000 customers for whom we store information. We want to find all customers who live in California. Suppose 50 customers are from the Golden State. With a shared file, the program has to read the entire file and determine which customers are from the state of California. To do that, all 10,000 records are sent from the server to the client. The user is going to wait a very long time for a response.

In a client/server environment, the client creates a SQL statement such as SELECT * FROM Customer WHERE State = 'CA'. The request is sent to the database server. The server evaluates the request and finds the 50 records before sending any data back to the client. Because only 50 records are sent to the client (instead of 10,000), network traffic is reduced dramatically.

The simplistic model that I have just illustrated works fairly well when the number of simultaneous users, such as in a departmental application, is not too high. However, it begins to sag when we try to scale (grow) it to the enterprise level. In the sections that follow, I address this issue as well.
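The client side of that exchange can be as small as the sketch below, written with ADO (introduced in Chapter 7) purely for illustration. The DSN name and column are hypothetical, and a project reference to Microsoft ActiveX Data Objects is assumed:

    Dim cn As New ADODB.Connection
    Dim rs As New ADODB.Recordset

    cn.Open "DSN=SalesDB"            ' ODBC data source pointing at the database server
    rs.Open "SELECT * FROM Customer WHERE State = 'CA'", cn

    Do While Not rs.EOF              ' Only the 50 matching rows ever cross the network
        Debug.Print rs!CustName
        rs.MoveNext
    Loop
    rs.Close: cn.Close

The point is where the filtering happens: the WHERE clause is evaluated by the server's engine, not by the client.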
Multitiered Architectures
With the terms two-tiered, three-tiered, and even n-tiered architectures, we are referring to the number of layers of communications that make up an application. Each term represents a division of duties. If you write an application program that communicates with a database server, you construct a two-tiered application. The database is primarily responsible for returning data to the client. The client is primarily responsible for providing the user interface and the business logic. This structure was shown in Figure 1.1.

There are a number of problems with this approach. Although a goal of client/server is to minimize network traffic, the fact remains that if you add enough users, the network will still be crushed. The database itself can support only so many active, simultaneous connections before its performance begins to sag, simply due to a lack of resources (memory and so on). Also, you tend to reach a point where the network traffic becomes too high. If your multiuser FoxPro system could support 5 or 10 users, the two-tiered client/server equivalent could probably support 50 or 100 (because the client/server approach puts much less strain on network resources than does the shared-file approach used in a FoxPro application). Beyond those 50 to 100 users, however, the network still can't handle the amount of data being moved around.

Other complications come from the sheer complexity of maintaining that many clients. Assume you do manage to get 1,000 users on your client/server application. Each time the application needs to be maintained, all 1,000 PCs must be updated. That is a Herculean task for which no good administrative tools exist. Further, today's graphics-heavy environments and intensive computations require ever bigger and faster PCs, creating an endless cycle of expensive PC upgrades and replacements.

Therefore, the obvious answer is to offload as much processing as possible to an application server. In this scenario, the client application is mainly concerned with presenting a user interface and does as little business logic as possible. The business logic is moved to a server that performs business-oriented processing. The application server manages connections to the database and performs all data requests of the database. The database also performs a limited amount of logic by
enforcing business rules on the server. This three-tiered model is illustrated in Figure 1.2.
Figure 1.2 In this three-tiered system, the client connects to the application server, which in turn connects to the database server.
A more familiar example of a three-tiered application might be what you see on the Internet. Visit a site such as www.amazon.com (an online bookstore) and search for books with Visual Basic in their titles. Your PC is the client in this model, with the Web browser containing the user interface. It connects to a Web server at Amazon's site. The Web server's main job is to create Hypertext Markup Language (HTML) pages to be sent to the Web browser on the client. Extensions on the Web server perform some business logic and interact with a database server. The Web server sends a request to the database server asking for all titles that match your search criterion. The database server processes the request and sends the result back to the Web server. The Web server then formats the result set into an HTML page and sends that back to the client for presentation within the Web browser.

Beyond overcoming the practical limitations imposed by the two-tiered design, a multi-tiered design offers other advantages that exploit the location and nature of each physical tier. Figure 1.3 shows a group of PCs networked together with the application server on the same ring. Notice that the client PCs can communicate with the application server, but only the application server communicates with the database server. This arrangement reduces a certain amount of data traffic on the network because the database server isn't even on the network. Thus, PCs on the network that are not part of the client/server application itself suffer minimal impact on network performance.
Figure 1.3 A three-tiered configuration with the client computers communicating with the application server via the network.
COM automatically knows how to communicate with the object and can take advantage of its methods and properties.
Distributed Component Object Model (DCOM) technology extends this concept by allowing a component to be placed anywhere on a network. Any other object can communicate with it without knowing or caring where it is on the network. Assume you have a client/server application that uses the Internet as its network. Your Visual Basic application can invoke the methods of custUpdate (without any change in coding) regardless of whether it is located on your computer or on a server on the other side of the world.

You can take advantage of DCOM by deploying all your business logic on an application server. Instead of having a custUpdate object on every client in the network, the object is located only at the server. The object operates independently of any client and independently of the database server. Should a business rule change (perhaps you will allow a customer credit of up to $1,000 instead of $500), you have only one place to make the change, and that change is instantly made available and enforced for all clients. By placing all your business logic on an application server, you have effectively partitioned the business services from the user services (and from the data services).
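In code, this location transparency is nearly invisible. The sketch below assumes the component has been compiled and registered as described in Chapter 10; the ProgID OrderSvr.custUpdate, the machine name, and the method are hypothetical. VB6's CreateObject function accepts an optional second argument naming the remote server:

    Dim cust As Object

    ' Local activation: COM locates the component on this machine.
    Set cust = CreateObject("OrderSvr.custUpdate")

    ' Remote activation via DCOM: same component, different machine.
    Set cust = CreateObject("OrderSvr.custUpdate", "appserver01")

    ' The calling code is identical either way.
    cust.UpdateCredit 318, 1000    ' Hypothetical method: raise customer 318's credit limit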
little more than present forms, menus, and the like to the user. All of a sudden, we are actually taking some of the strain off the client, and fewer resources (such as memory, disk space, and CPU) are needed. The direction we are heading, then, is toward a thin client. Visual Basic 5 introduced the capability to create applications that actually run within a Web browser. No form or menu runs directly on the client; the application is housed in the browser, and the browser becomes the user interface. If users want to access a client/server order entry system, they launch Internet Explorer (or Netscape Communicator). If they want to access the accounting client/server application, they launch Internet Explorer. If the PC is powerful enough to run Internet Explorer, it is powerful enough to run any of the client/server applications. This is how you implement a thin client.

It means that the application is truly network independent: If the network supports TCP/IP, it supports your application. It also means that your application is operating system independent: If the client has a Web browser that supports active content, it can run your application. Your Visual Basic application system is portable across multiple operating systems.

Lest I sound like a walking billboard for Microsoft, let me be the first to say that the technology is not perfect. Can you do everything I just said in the preceding paragraph? Yes, sort of. You may be constrained by network limitations. (The Internet, for instance, has been known to get bogged down from time to time, and that is beyond your control.) The range of database connectivity options and methodologies (a major subject in this book) is confusing, to say the least. As the technology matures, it is probably wise to tread lightly and test carefully. Still, it is an exciting trend, and it will ultimately simplify development enormously.
Systems Analysis
Assume that we are building a traditional order entry system. The first step is to gather detailed requirements from users. We can use a number of well-established, formal methodologies (on which whole books have been written). Those experienced in the field often use a somewhat less formal approach. In this day and age of rapid application development (RAD), many companies turn to joint application development (JAD) brainstorming sessions. In these JAD meetings, users and technicians meet to hammer out first the functionality of a system and, normally, the data elements to be captured. The latter form the basis for the database design, which I review in Chapter 2. For instance, in an order entry system, typical functions will include the following:

Add, update, and delete customer.
Add, update, and delete order.

It is useful to compare the functions to the database design to verify that each database entity has at least one occurrence of add, update, and delete (to ensure, for example, that there is a mechanism to add a customer). It is often helpful to bring senior management to at least the initial JAD meeting to clarify the organization's goals. It is not uncommon to encounter a sizable disjoint between what senior management sees as the organization's philosophy and what the rank and file sees. Management can provide a mission statement from which all functionality is derived: "Sell widgets worldwide." The mission statement can then be viewed as the highest-level function of the organization. That function can be decomposed into smaller units that define how the overall mission is accomplished (see Figure 1.4).
Figure 1.4 The beginning of a simplified functional decomposition.

Each function is decomposed further until an atomic-level function statement is achieved. A function that has been decomposed to the atomic level generally provides one and only one output. An example of an output is a report or a table being updated. (In practical terms, you may then opt to combine two functions to make the application more usable. For instance, it is common to maintain order headers and associated line items on the same form.) As you can see, you are not defining how the application will accomplish a task. Instead, at this stage, you are merely concerned with what the task is.

Because of space constraints in this book, I cannot explain all of the ins and outs of the different approaches to systems analysis and design. I spend a good 40 hours teaching these skills to students in what is still a compressed format. However, I end up stressing that the whole process of systems design comes down to practicing common sense. Decomposing functionality is particularly troublesome to my students, who invariably are looking for the one right answer. I have to tell them that there is no one right answer. If the decomposed functions accomplish the goals of the system, then the decomposition is correct. In general, there is a one-to-one correspondence between an atomic-level function and a program (such as a Visual Basic form or report).

Once application functionality has been defined, you are ready to think in terms of how to make it happen. There are two aspects to this: defining the data that needs to be captured and describing how the data will be manipulated. (In Chapter 2, I discuss how to design a database.)
Encapsulating Functionality
Another source of confusion is the concept of encapsulating functionality. All that this intimidating word (encapsulation) means is embedding functionality into another object such that its behavior is exposed but its integrity is protected. Okay, maybe that is not so easy to understand. Consider the database server. When you send an SQL statement to the server asking it to update the address of a customer, you are not doing the actual update; the server is. The database engine edits your SQL statement for validity, ensures that the customer number is valid, verifies the state or province and postal code, and then performs the physical update. It places an entry in the transaction log and informs your program whether the update was successful. You cannot access the underlying data or functionality directly; it is encapsulated within the database server itself.

In designing your system, you seek to do the same thing: encapsulate your application logic. If you need to perform customer maintenance, you seek to isolate the data and logic. Isolating the data allows you to guarantee its integrity. For instance, in updating a customer's address, you might want to validate the state or province code as well as the postal code. Therefore, you could embed that logic into an object. By embedding the data in that object, you force all requests to update the data to go through the object's methods. This embedding is a recurring theme throughout this book, and I expand upon the concept in the following section.
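In Visual Basic, the natural home for this kind of encapsulation is a class module. The sketch below assumes a hypothetical class named CCustomer; the private variable is the protected data, and the Property procedures are the only door through which it can be touched:

    ' Class module: CCustomer
    Private msState As String       ' Encapsulated data; invisible to callers

    Public Property Let State(ByVal sValue As String)
        ' The validation rule travels with the data.
        If Len(sValue) <> 2 Then
            Err.Raise vbObjectError + 1, "CCustomer", "State code must be two characters"
        End If
        msState = UCase$(sValue)
    End Property

    Public Property Get State() As String
        State = msState
    End Property

Client code can write cust.State = "ca", but no code path can store an invalid value in msState, because every assignment is forced through the Property Let procedure.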
Figure 1.5 An application broken into three tiers of services.

To build such an application, you need to create two (or more) Visual Basic applications. The client services portion is actually rather simple, needing only to provide an interface to the user and the knowledge of how to communicate with the business services tier. The business services portion is a separate program running remotely (from the client) to which multiple clients connect. Visual Basic provides a rich set of tools to build these portions; much of the hard work has already been done for you.
Distributed COM
When you build an ActiveX component and then use it in a VB project, COM provides a proxy and a stub for each side of the connection between your application and the component. A proxy is an object provided by COM that runs in the same memory space as the calling object. It packages parameters and acts as a bridge between the two objects. If your application is communicating with an ActiveX component, the proxy runs in the memory space of your application. It collects arguments to be passed to the ActiveX component's methods and properties and communicates with the component's stub. The stub, on the other hand, runs in the address space of the receiving object, unpackages the parameters received from the proxy, and passes them to the component. This communication works both ways, as shown in Figure 1.6.
Figure 1.6 COM and DCOM automatically provide proxy and stub services, allowing two objects to communicate with one another with little to no intervention on the part of the developer.

The workings of DCOM are almost identical. The application houses a remote automation proxy, which communicates with the component's remote automation stub. DCOM also adds network transport services in a manner that is invisible to the application, making the location of the component on the network irrelevant to the client. This allows for tremendous flexibility in design. Visual Basic (Enterprise Edition) provides tools to register components. In Chapter 10, I review step by step the entire procedure for implementing DCOM.
In the remainder of this chapter, I review some of the key concepts of the underlying network architecture. Although the client/server developer does not need to be a network engineer, it is helpful to understand the basics of the physical model upon which the application is based.
Peer-To-Peer Networks
A peer-to-peer network is one in which the computers are connected to one another directly, in series. Each PC typically contains a LAN adapter or Network Interface Card (NIC) with two jacks, and a line is run from one computer to the next. Typically, these jacks accept RJ-45 connectors (similar to, but slightly larger than, the RJ-11 connectors on your phone cord). Although it is simple to set up, such a network is typically slow because data passes through each computer. It is vulnerable to any individual workstation being powered down or crashing, and because the network has no server, there are minimal security constraints. A Windows 95 or Windows 98 network is a peer-to-peer network.
Star Networks
The star network topology implies a network where all computers are connected to a hub (a box, similar to a telephone switchboard, containing minimal intelligence). Multiple hubs are connected to one another. Star networks are simple to maintain but offer minimal security constraints. If a hub goes down, every computer attached to it loses the network.
Ring Networks
In a ring network, all computers are connected via a continuous cable. A multiplexing unit monitors the network, allowing only one packet to circle the network at a time. The packet is continually sent from computer to computer, and each computer examines the packet to see whether it is addressed to that workstation. If a computer needs to send information, it waits for the packet and attaches a request to transmit. The multiplexing unit gives permission to each workstation in turn, in a procedure akin to a parliament (giving each member a turn to speak but also allowing for higher-priority speeches, such as error messages). If the packet is accidentally destroyed, the multiplexing unit recreates it.

IBM's Token Ring architecture runs on a ring topology. It has the advantage of being very reliable, but because of the extra overhead of packet monitoring, it is somewhat expensive to maintain and does not scale to a large number of users.
Bus Networks
The bus topology connects computers along a single line of cable. Although each computer is typically connected to a hub, the internal wiring of the hub still connects the computers serially. When a workstation wants to send a packet of information, it listens for the line to be clear through a process known as carrier sensing. Only one packet can travel on the network at a time, and if two computers happen to transmit at the same time, a packet collision occurs. Each workstation listens for such collisions and, when they are detected, waits a random amount of time and then resends the packet. The most common implementation of bus technology is Ethernet. Although fairly simple and inexpensive to install and maintain, a bus network is vulnerable to any breaks in the cable. Further, as more users are added to the network, performance can degrade drastically. Still, Ethernet is by far the most common architecture. To maintain satisfactory performance, network engineers break a large Ethernet network into smaller LANs connected via bridges.
Network Protocols
In the 1980s, when we logged onto a bulletin board, we said the two computers were "shaking hands," agreeing on how to communicate with one another. Today, we more accurately refer to this process as agreeing on a protocol. A protocol is a low-level language with which two or more computers communicate. Imagine you send a 2MB file to a friend across the company or across the globe. The file is broken into packets, and each packet carries several pieces of information:

Destination address: The computer to which the packet is being sent.
Source address (usually): The computer or person from which the file was sent.
Checksum: A mechanism by which the receiving computer can verify the integrity of the data being sent.
Sequence number: A number that enables the receiving computer to reassemble all the packets into one file and to tell the sending computer which packets to resend in the event of transmission errors.
Data: The actual data being sent.

A number of protocols are commonly used in organizations today. Most are based on the Open System Interconnection (OSI) Reference Model, which defines standard layers of network communications. An in-depth discussion of these protocols is beyond the scope of this book; however, protocols that adhere to this standard can generally intercommunicate with relatively inexpensive translation hardware. In general, you can think of a network protocol as similar to an agreement at the United Nations to speak in one certain language that everyone understands.

The most common protocol in use is TCP/IP, the language of the Internet. The TCP (Transmission Control Protocol) portion governs how a packet moves from application to application. The IP (Internet Protocol) portion governs how data moves from computer to computer. Although it is the protocol of the Internet, TCP/IP is also widely implemented on internal networks within organizations. For now, understand that different networks employ different protocols and often employ more than one protocol simultaneously. IPX/SPX is the native protocol of NetWare networks. IBM LAN Manager's native protocol is NetBIOS. Other protocols that you may run into range from NetBEUI (Microsoft Windows NT) to AppleTalk (Apple Macintosh).
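As a purely illustrative aside, the packet fields listed above could be modeled in Visual Basic as a user-defined type. Real protocol headers differ in layout and size; the field names and widths here are hypothetical:

    Private Type Packet
        DestAddress   As String * 4     ' Destination address
        SourceAddress As String * 4     ' Source address (usually present)
        Checksum      As Long           ' Lets the receiver verify data integrity
        SequenceNum   As Long           ' Position for reassembly and resend requests
        Data          As String * 512   ' The actual payload
    End Type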
interface of a Unix platform. In the following sections, I review the more common network operating systems that you are likely to encounter.
Novell NetWare
Once the king of the hill, NetWare has seen its market share steadily decline relative to the Microsoft Windows NT juggernaut over the past few years. Whether the declining market share is deserved is a subject for conjecture. NetWare is known as a dedicated operating system, meaning that the server can run no programs other than the operating system itself. Any other program you want to run must actually become part of the operating system. You accomplish this by compiling the other program (an email or database server, for example) into a NetWare Loadable Module (NLM). When the NOS loads, any NLMs load as well and become part of the operating system itself. There is a clear advantage to this arrangement: Because the NLMs are part of the operating system, they run much faster than they otherwise might. They can take advantage of operating system services (such as file I/O) without going through intermediate drivers and link libraries. On the other hand, if the application crashes or is otherwise poorly behaved, the operating system will also crash (because the application and the operating system are one and the same).

A few years ago, Novell sought to expand its presence in the workplace with the acquisition of Quattro Pro and WordPerfect. Many in the industry believe this was a serious tactical blunder, removing Novell's focus from the network arena, where it had always been the market leader, and allowing competitors, particularly Microsoft, to make inroads. Novell eventually sold those products to Corel, but by then, the damage had been done. Regardless, NetWare has a large presence in the corporate arena and a strong suite of tools. An ironic side benefit of NetWare's market erosion is a relative abundance of talent to support the product set.
Banyan Vines
Banyan's product has never garnered as large a market share as it probably should have, always operating in the shadow of other players in the marketplace. Like Novell NetWare, Vines is a dedicated operating system. Although it is a solid platform for file services, I am not aware of any major database products that run on it.
Windows NT
The Microsoft juggernaut seems to roll on with Windows NT. The operating system was developed to replace Microsoft LAN Manager, which was jointly developed with IBM in the 1980s. Like Windows itself, NT did not enjoy widespread acceptance for several years. NT version 3.51 seemed to be the right answer for organizations, which began to embrace it as a viable network platform. NT version 4 is a robust operating system, although arguably not as powerful or scalable as Novell NetWare 4.x. In particular, the word among network professionals is that Windows NT 4 seems to hit a wall at about 1,100 users. Likewise, its directory services are weak in comparison to Novell NetWare's. However, NT 5 (in beta as of this writing) seems to answer many of these complaints.

Windows NT is a nondedicated server, meaning that other programs can run alongside the NOS itself. Microsoft traded some resource intensiveness (Windows NT needs more memory and CPU speed than most other NOSs to run comfortably) for a high degree of stability. The operating system spends a fair number of CPU cycles enforcing the integrity of operations. For instance, it does not permit any program to address hardware directly, and it prevents applications from writing to memory outside of their own address spaces. Windows NT sports a solid set of graphical, if somewhat complicated, tuning tools to maximize performance. Reportedly, Windows NT represented 50 percent of all network server licenses as of February 1998.
In this chapter, I guide you through the essentials of what is the heart and soul
of any client/server system: the database. Although a Visual Basic program can run against almost any back-end database, it is the Relational Database Management System (RDBMS) that brings the server to client/server. When training students or corporate clients in client/server development techniques, I stress that properly designing the database is 80 percent of the battle in constructing and deploying a successful client/server application. With a well-designed database, you have a good chance of deploying a successful client/server application. On the other hand, if the database is poorly designed, all the development skills in the world cannot create a robust, reliable client/server application. Therefore, no skill is more important in client/server development than properly designing a database. A properly designed database accomplishes three essential things for the client/server developer:
Through a properly normalized data schema, complete with intelligently designed integrity constraints, the database can assume much of the burden of validating basic business rules, such as requiring that an order have a valid customer. By using the referential integrity tools made available by the database, you minimize or eliminate altogether the chances of data corruption problems. By moving data update, retrieval, and validation processing to the server, you enhance application performance by minimizing network traffic and by moving some of the intensive processing from the client to the server (which, presumably, is a high-end machine optimized for the load).
Any database that organizes data in related tables and that can enforce those relationships via referential integrity can lay claim to being a relational database. The database engine manages the data and enforces the integrity constraints. If there is no engine, the responsibility falls on the application program to perform those roles. Consider a Microsoft Access database that I call demo. It is unquestionably a relational database. However, assume I put the database file on a server and then access it from multiple client programs. You will see that there are actually two files: demo.mdb is the actual database file (MDB stands for Microsoft database), and demo.ldb is a file used to manage record locking (LDB stands for lock database). Now, suppose you want to see all the customers from California. Your application generates an SQL statement that looks something like:
SELECT * FROM CUSTOMER WHERE STATE = 'CA'
Assume there are 50 customers from California on a table of 5,000 customers. All 5,000 records are returned to your application, and the application itself then reads through those records to find the 50 that you need. The application receives 100 times as much data as required. It is easy to imagine how this activity bogs down network performance, especially if 10, 50, or 500 client programs all make the same types of requests. This inefficient data processing occurs because the Access database is just a file, with no engine to process the data request. Contrast this example with one involving an RDBMS, such as Oracle or Microsoft SQL Server, which are true database engines. In this scenario, the SQL request is sent to the RDBMS, which processes the request on the server. The server sends back to your application only the 50 records that are required. Network traffic is reduced by 99 percent. The moral? A wolf in sheep's clothing is still a wolf. A database might be relational (such as Microsoft Access) or present an SQL front end (such as dBase), but if it is not a relational database engine, it is not an appropriate choice for client/server development.
-----------
proper choice is highly dependent on each organization's needs and environment. Further, although I have, in fact, used most of the products that I discuss in the next section (A Survey Of Available RDBMSs), I have not used all of them, and any specific recommendations are likely to be somewhat subjective. Of course, your organization may have already deployed an RDBMS that is serving you quite well. In that case, get a cup of coffee and then move on to the next section.
thousands or millions of detail records that are queried and summarized but are not typically updated. Executive information systems (EIS) and data warehouses are examples of these types of applications. Either way, the needs are quite a bit different from those for an OLTP system.

If you have a busy OLTP system, you might want to consider such factors as how a given database performs record locking. Sybase, for instance, uses what is called page-level locking: all records on a database page are locked when an update is to occur. Oracle, on the other hand, performs row-level locking: only those rows that are to be updated get locked. The locking scheme has implications if several users are trying to update different records on the same page (a sketch follows at the end of this section). With the Sybase product, the users have to wait until the first user has released the lock on that page. Microsoft SQL Server 6.0 performs page-level locking, whereas SQL Server 6.5 does pseudo row-level locking: The rows immediately before and after the row to be updated are also locked. SQL Server 7.0 performs true row-level locking. On the other hand, the nature of your data may be such that a page contains few rows, which makes the issue of row-level versus page-level locking less important.

If you are planning a DSS, you should consider how well the database handles very large volumes of data. Most of the major vendors support some variation on massively parallel processing (MPP), where a long-running query is executed in multiple threads to better handle the intensive querying and to speed up processing.

Determining which database will best handle your needs for an OLTP or a DSS is not an easy undertaking. My recommendation is to talk to peers who have used the products in environments similar to your own. In addition, check out the computer press for its independent benchmarking results. Lastly, consider searching the Internet (but avoid the vendors' own biased claims).
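To picture the page-level problem, consider the following sketch; the ACCOUNT table and its columns are hypothetical, invented here purely for illustration.

-- Session 1 (page-level locking, as in Sybase):
UPDATE ACCOUNT SET Balance = Balance - 100 WHERE Acct_No = 1001
-- The entire page holding account 1001 is now locked.

-- Session 2, before session 1 commits:
UPDATE ACCOUNT SET Balance = Balance + 100 WHERE Acct_No = 1002
-- If account 1002 happens to sit on the same page, session 2 must
-- wait, even though it touches a different row. Under row-level
-- locking (as in Oracle), both updates proceed independently.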
Database Administration
My own subjective experience is that Oracle's database engine is probably the most powerful in a high-transaction environment. However, this power comes at the expense of more complicated administration. To squeeze the best performance out of a complex database product such as Oracle, the database administrator (DBA) needs to monitor and refine a variety of operating parameters, such as defining the frequency of checkpoints, partitioning the data, and so on. (The particulars of database administration vary from database to database and are beyond the scope of this book.) How much administration is required also depends on the nature of the applications that the database supports in your own organization.
The good news is that most of the vendors (including Oracle) are simplifying administration; even so, a poorly configured database yields poor performance. Consider Microsoft SQL Server running on a Windows NT server with 1GB of RAM. By default, SQL Server uses only the first 16MB of RAM. A low-end RDBMS product that uses most or all available memory performs much better than a high-end RDBMS that has not been configured to take advantage of the available resources.
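As a sketch of the kind of configuration involved, assuming the SQL Server 6.5 sp_configure interface (where the memory option is expressed in 2KB units; the figure below is illustrative, not a recommendation):

-- Let SQL Server use roughly 512MB of the server's 1GB of RAM
-- (262144 pages * 2KB = 512MB)
EXEC sp_configure 'memory', 262144
RECONFIGURE
-- The memory option is static in this release; the new value
-- takes effect when SQL Server is restarted.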
require the database to work too hard to locate the desired data or force it to read through too many records to return the desired result set.
- The application data logic: The application (or user) can easily throttle performance with poorly phrased queries that do not, for instance, correctly specify table joins. Consider a department table that has 50 rows and an employee table that has 500 rows. A list of employees showing their departments should, of course, produce 500 records. If the tables are not properly joined, you might end up with 25,000 records instead (see the sketch after this list)! Now, consider a program that allows the selection of an employee for maintenance. Inexperienced developers invariably seek to present a drop-down list box containing all 500 employees. That means that each time the user wants to see one employee, the database must return 500: hardly the purpose of client/server. Worse, what if the company has 5,000 employees? You can see where performance would, again, rapidly degrade through no fault of the RDBMS.
- The RDBMS: The RDBMS itself can easily be so badly maintained as to produce unbearable results when even minimal performance tuning might yield improvements on the order of a 90 percent reduction in response time. Study the tuning manual that comes with each database. Periodically reorganize the database (a task that is similar to defragmenting your hard drive). Ensure that the database is allowed to use all of the resources (memory, disk, and so on) available. Update the system statistics frequently (system tables maintained by the database that give the query optimizer information about the types and volumes of data stored on the tables).
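Here is a sketch of the join mistake just described, phrased against the EMPLOYEE and DEPARTMENT tables used later in this book; the Dept_Name column is assumed for illustration.

-- No join predicate: every employee pairs with every department,
-- returning 500 * 50 = 25,000 rows
SELECT Emp_LName, Dept_Name
FROM EMPLOYEE, DEPARTMENT

-- Properly joined: one department per employee, returning 500 rows
SELECT Emp_LName, Dept_Name
FROM EMPLOYEE, DEPARTMENT
WHERE Emp_Dept_No = Dept_No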
Database Cost
Although I personally hate to see the cost of an RDBMS become a major consideration, the reality is that some shops do have to live within certain monetary constraints. A database engine such as Oracle might run into a licensing fee of many thousands of dollars, whereas a product such as SQL Anywhere might cost only several hundred dollars. It should be noted that many vendors will wheel and deal in price negotiations. Be careful when comparing prices. A visit to Oracle's Web site shows you that the licensing fee to run the RDBMS on a single server is $1,450 (at the time of this writing). The fine print near the bottom of the page reveals that this is for five concurrent users. Most organizations running Oracle have somewhat more than five users. When factoring costs, you need to consider the number of servers upon which the database is to run, the maximum number of concurrent users, the maintenance fee, and quite possibly a separate telephone support fee. The maintenance fee is often 12 to 18 percent of the licensing fee annually. Telephone support can easily run into thousands of dollars per year. A final consideration in database cost is the so-called cost of ownership, a term so often bandied about when discussing PCs. Above and beyond licensing, maintenance, and support costs, you might want to consider that the hardware requirements of the different RDBMS products differ (ignore the stated minimum requirements and concentrate on the recommended configuration), as do the internal support costs. A talented DBA can tremendously enhance the performance of the RDBMS (while commanding a salary that would make the corporate CEO blush).
You might want to consider the financial stability of the RDBMS vendor. As of this writing, Sybase was posting large losses and laying off 10 percent of the company's employees. There were also questions regarding Informix's financial results. Such news may be of concern in terms of a company's long-term viability (will it be in business five years from now?), the organization's ability to develop and improve the product, and its ability to support the product. (Please note that although Sybase's woes cannot be construed as good news for any customers, I personally do not predict that an organization that large will go under. The Sybase product set is excellent, and the company continues to boast a top-notch technical development staff across a diverse product line.)
Microsoft Access
As discussed earlier in this chapter, Microsoft Access is not truly an RDBMS. However, it is a popular development back end and offers some fairly sophisticated features, such as true database replication. On the other hand, it does not offer a complete SQL implementation (see Appendix B for a comparison of Jet SQL to ANSI SQL), and because it is not truly client/server, its performance pales against even the least expensive of the true client/server database engines. Nevertheless, because it is popular and because Visual Basic offers easy support for Access, I use it in examples throughout the book. Access Version 8 is bundled with Microsoft Office 97 Professional and is available as a standalone product. However, you can use Visual Basic to write to and read from Access databases even if you don't own Access itself. For more information on Microsoft Access, visit the Microsoft Web site at www.microsoft.com/access/.
Sybase SQL Anywhere
Sybase's SQL Anywhere, despite its small implementation, has a complete implementation of SQL (I discuss the ANSI SQL standard in Chapter 3) and offers full support for Transact-SQL. (T-SQL is a dialect of SQL used by Sybase SQL Server and Microsoft's SQL Server 6.5.) The product requires almost no administration but does not scale to the enterprise level. The current version of SQL Anywhere (as of this writing) is 5.5. It runs as either a server or a standalone (single-user) RDBMS on Windows 95, Windows NT, Windows 3.x, OS/2, MS-DOS, and NetWare. For more information, visit Sybase's Web site at www.sybase.com.
Oracle
Oracle Version 8 runs on a wide variety of platforms, including OS/2, NetWare, Windows NT, and Unix (including Sun Solaris, IBM's AIX, HP-UX, and SCO). A personal edition of Oracle runs on Windows 95/98. Oracle has long enjoyed a reputation as the most powerful database on the market, with advanced features optimizing large-volume data processing, distributed data management, Web connectivity, and robust OLTP. Still, the gap in processing capabilities between Oracle and its competitors has eroded and perhaps disappeared under certain circumstances. Although Oracle was once considered a pricey product, competition has driven the costs down. Oracle has been the number one vendor for many years. Whereas the two Sybase products offer extensions to SQL known as T-SQL, Oracle uses a robust dialect known as PL/SQL. As with T-SQL, I also present samples in this book using PL/SQL. For more information, visit the Oracle Web site at www.oracle.com.
Informix
For years, Informix was a distant runner-up in terms of sales to the two giants, Sybase and Oracle. In recent years, the Informix product line has grown in sales until it is now solidly entrenched as the number two vendor of RDBMSs. The Informix products scale well from the department to the enterprise level and offer well-integrated Web connectivity options. Informix sells versions of its enterprise server that run on most types of Unix and on Windows NT. For more information, visit the Informix site at www.informix.com.
IBM DB2
IBM's relational database, DB2, runs on a wide variety of platforms, including most types of Unix, Windows NT, OS/2, and Windows 95/98, as well as VAX VMS and IBM MVS and VSE. Although in the past its network-level database products have not fared well in comparison to the products of other vendors, its latest version has tested well on Windows NT. DB2 is of particular interest to those organizations that need a high degree of compatibility or interoperability with their mainframe database. For more information, visit the IBM Web site at www.ibm.com.
XDB
XDB Systems offers a well-done RDBMS that provides a high degree of compatibility with IBM's DB2 and, like DB2 on the network, would be of interest to shops also running a mainframe. By itself, XDB is a capable RDBMS. In addition, it has extensions that enable the querying of DB2 and IMS databases and VSAM files on the mainframe (though VSAM files cannot be updated). My own experiences with XDB have been positive, although I feel it is probably not as powerful as the very top-end RDBMSs. I am no longer nervous about the company's long-term prospects because it has announced its acquisition by Micro Focus, makers of COBOL tools for the network. This is a logical pairing because Micro Focus tools have always integrated well with XDB. The choice of XDB is appropriate for shops interested in leveraging legacy COBOL applications by migrating them to a client/server environment. For more information, visit XDB's site at www.xdb.com.
Database Organization
A relational database, unlike sequential files, organizes all of its data in a single file. Figure 2.1 illustrates a nonrelational and a relational design of a database with customers and orders. Both cases have four indexes: two on the customer data and two on the orders data. The nonrelational approach illustrates how you might build the database with dBase files: There are a total of six files with no true logical association between them. For instance, nothing stops me from opening the ORDERS.DBF file and adding an order even if no valid customer is associated with that order. Even worse, nothing stops me from exiting to DOS and typing DEL CUSTOMER.DBF. After doing so, I have two indexes serving no purpose and a whole lot of orders for which I have no clue about the customers.
Figure 2.1 Nonrelational databases are typically a collection of unassociated files. Relational databases contain all of their tables and indexes inside a single file.
On the right side of Figure 2.1 is a single file managed by an RDBMS, such as Microsoft SQL Server, containing the two sets of data and four indexes. The data is actually stored as tables; tables and indexes are both stored as objects within the database.
Tables
A database table is organized into rows and columns much like a spreadsheet.
The columns correspond to fields in a sequential file. The rows correspond to records. Figure 2.2 illustrates this with a simple implementation of a Customer table. Cust_No represents a column, whereas John Smith represents a row in a two-dimensional grid.
Figure 2.2 A relational database table is organized into columns and rows.
Every row in a table must be uniquely identifiable. In other words, there must be something on each row in a table that is different from any other row on that table. The column (or combination of columns) that makes a row unique is called the primary key. On the Customer table, the Cust_No column is the primary key; there can be one and only one customer 101, one and only one customer 102, and so on.
Figure 2.3 The second table, Orders, is related to the first table, Customer, through the Cust_No column.
Table Relationships
The RDBMS enforces the rule that the customer number on the order must already exist on the Customer table through the use of referential integrity. Specifically, you can create a foreign key on the Orders table that states that the Cust_No column must relate to a key on the Customer table. In general, the foreign key specifies that the values of a column (or columns) on a dependent table must exist uniquely on a master table. The Orders table is dependent on the Customer table because values on it must already exist in the master table. Sometimes this relationship is called parent-child. Figure 2.4 shows the relationship between the Customer table and the Orders table with the foreign key.
Figure 2.4 The Customer and Orders tables with primary keys and a foreign key enforcing their relationship.
TIP Foreign Keys And Primary Keys
The foreign key on a child table is dependent on a key on the parent table. In other words, the value stored in a foreign key column on a child table (such as the Cust_No column on Orders) must exist on the parent table; in this case, the Cust_No column on the Customer table. Some databases require that the foreign key relate to the primary key of another table. Others only require that the related key on the parent table be a unique index. Remember that a primary key is itself a unique index. If the related key on the parent table is, itself, a compound key, then the foreign key must also be a compound key. The foreign key must have the same number of columns as the related key on the parent table, and each of those columns must be the same data type.
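A sketch of the compound-key rule; both tables here are hypothetical, invented to show the shape of the declaration.

-- Parent: the unique key spans two columns
CREATE TABLE REGION_ORDERS
  (Region_No SMALLINT NOT NULL,
   Ord_No INTEGER NOT NULL,
   CONSTRAINT Pk_Region_Ord PRIMARY KEY (Region_No, Ord_No))

-- Child: the foreign key must supply the same two columns,
-- in the same order and with the same data types
CREATE TABLE SHIPMENTS
  (Ship_No INTEGER NOT NULL,
   Region_No SMALLINT NOT NULL,
   Ord_No INTEGER NOT NULL,
   CONSTRAINT Pk_Ship PRIMARY KEY (Ship_No),
   CONSTRAINT Fk_Ship_Ord FOREIGN KEY (Region_No, Ord_No)
     REFERENCES REGION_ORDERS (Region_No, Ord_No))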
Database Integrity
When designing a database, it is important to remember that the foreign key is always defined as a constraint on the child or dependent table and that table is always part of a one-to-many relationship. I cover the details of this concept under Database Design later in this chapter. However, for now, you should understand that related tables are always related in a one-to-many manner. For example, an examination of Figure 2.4 reveals that for any one customer, there might be many orders. However, for any one order, there can be only one customer. Wherever a one-to-many relationship exists, the table at the many end of the relationship is always the child table, and there should always be a foreign key defined to enforce that relationship. I review how to create foreign keys on the database in Data Definition Language later in this chapter. I described a foreign key as constraining the values that a certain column can contain on a table. This is but one type of constraint that the database can enforce. A foreign key enforces referential integrity. Although databases differ in their capabilities, in general, integrity constraints can be placed in three categories:
- Referential integrity: Enforces relationships between two tables.
- Domain integrity: Enforces the values that can occur in any column across an entire table.
- User integrity: Enforces the values that can occur in a column based upon values on the same row.
Domain integrity and user integrity can be enforced using a variety of constraint types:
- The primary key, as mentioned earlier, enforces the requirement that a value uniquely identify a row on a table. This is an example of domain integrity.
- A unique key requires that any value stored in a given column be unique within that column. For instance, a Social Security number might be defined as unique. A primary key is implemented as a unique key.
- A check constraint validates the contents of a column on a row at either the domain or the user level.
- A column may be defined as NOT NULL, meaning that the column is not allowed to be null. (Null is the absence of any value at all and is not the same as an empty string. I discuss null more thoroughly in Chapter 3.)
A check constraint can restrict a column's values absolutely. For instance, you can create a check constraint that requires salary to be greater than zero. Because this rule applies to any row on the table, it is a domain integrity constraint. On the other hand, you might define a check constraint, such as date of hire must be greater than date of birth (to avoid problems with child labor laws, of course), that is dependent on another column on the same row. This is an example of a user integrity constraint. I show the syntax for creating each type of constraint later in this chapter.
Data Types
Relational databases define each column to be a certain data type. Although a number is really just a number regardless of what database is storing it, the RDBMS vendors tend to differ somewhat in what they call those types. For instance, what Oracle calls NUMBER, Sybase and Microsoft call NUMERIC. Data types tend to fall into four categories:
- Character data
- Numeric data
- Date data
- Binary data
Character Data
Character data types can hold any alphanumeric values. These values are usually referred to as strings or string literals. Even if the string happens to contain a number, you cannot perform arithmetic operations on it. The two main character data types are CHAR and VARCHAR. A CHAR data type is a fixed-length field defined as follows:
Cust_LName CHAR (21)
If Cust_LName holds the value Jones, the database right-pads the field with 16 spaces to fill up the field. Fortunately, most RDBMSs do not require you to account for this when searching for values. Typically, the database automatically trims the longer of two search conditions:
SELECT * FROM Customer WHERE Cust_LName = 'Jones'
In this example, the database automatically takes into account that the search term Jones is shorter than Cust_LName. The VARCHAR data type (called VARCHAR2 by Oracle), as its name implies, is a variable-length data type and looks like the following:
Cust_LName VARCHAR (21)
In this case, the column holds up to 21 characters, but if the name Jones is stored, it is only 5 characters long.
TIP CHAR Vs. VARCHAR
There is some debate about which is the better data type to use. Certainly, the CHAR data type can waste a lot of space in a database. This was an important consideration when disk space cost many dollars per megabyte. However, disk space is cheap now, and the database has to work harder to find columns where variable-length data is used. It is difficult to say exactly what the performance cost is, but I would speculate that the average database takes a 10 percent performance hit if it employs all variable-length character types. My recommendation is to use fixed-length CHAR data types for all but the biggest columns (such as a column to record comments). If you do end up using some VARCHARs, put those columns at the end of each row, where the performance penalty is minimized.
Numeric Data
Unlike character data, numeric data can take many forms and varies more from RDBMS to RDBMS than other data types. In general, you will run into the following data types (refer to your vendor's documentation for any specifics or differences in implementation):
- NUMERIC (P, S): P stands for precision and S stands for scale. Precision refers to how many digits the number is allowed to hold in total, and scale refers to how many of those digits are to the right of the decimal place, as in the following example:
Ord_Amount NUMERIC (11,2)
In this example, the largest number that can be stored is 999,999,999.99. Even if your number has no digits to the right of the decimal place, you can use only nine to the left of the decimal point. Thus, a column defined as NUMERIC (4,4) can be no larger than 0.9999.
- INTEGER is a data type that holds a whole number from -32,768 to +32,767. An UNSIGNED INTEGER is capable of holding positive whole numbers from 0 to 65,535.
- LONG numbers can be any whole number from approximately negative 2.1 billion to positive 2.1 billion. As with the INTEGER data type, your database may also support an unsigned variation, UNSIGNED LONG, which holds a positive whole number from 0 to approximately 4.2 billion.
- FLOATING POINT, SINGLE, SINGLE PRECISION, DOUBLE, and DOUBLE PRECISION are all variations on floating-point numbers. A floating-point number can handle values that cannot be represented exactly in a fixed number of decimal places (as well as whole numbers, of course). A number such as 1.25 can be represented in a fixed number of decimal places; a number such as 1/3, which is .333333 repeating to infinity, cannot. The database stores these numbers in scientific format. Single precision numbers (SINGLE or SINGLE PRECISION, depending on your database) are usually 4 bytes long. Double precision numbers are usually 8 bytes long, effectively making them able to handle a much larger range of numbers.
Date Data
Dates are encoded on the database as numbers and are, therefore, limited in their range. Oracle's DATE data type, for instance, has a range of 1 January, 4712 BC to 31 December, 4712 AD. The major RDBMSs are generally Year 2000 compliant. However, you should consider that your front-end development language might not support the same range of dates. Visual Basic, for instance, has a range of 1 January, 100 to 31 December, 9999. Inputting dates into the database can sometimes be a chore. The format of the date that the database expects is usually a startup option (that is, it is specified as an optional parameter when starting the database engine on the server). Usually, you can override this setting in an individual session by using the SET command. Oracle's default date format is DD-MON-YY, as in:
INSERT INTO table_name VALUES ('18-Dec-98')
Sybase, on the other hand, allows dates to be entered as either 1998-12-18 (December 18, 1998) or 1998/12/18. To alter the default format, you can specify SET OPTION Date_Order = 'DMY', which then causes dates to be interpreted in day, month, and year order.
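In Oracle, you can also sidestep the default format one statement at a time with the TO_DATE function; the columns below belong to the EMPLOYEE table used throughout this book.

-- Supply the date in any format by describing that format
INSERT INTO EMPLOYEE (Emp_No, Emp_LName, Emp_Hire_Date)
VALUES (101, 'Jones', TO_DATE('12/18/1998', 'MM/DD/YYYY'))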
Binary Data
Generally, the only time you will need to store binary data is when you want to embed objects such as bitmaps or word processing documents into the database. You accomplish such a task with the BLOB (Binary Large Object) data type. Although some RDBMSs are adding object-oriented extensions to automatically handle these objects (refer to A Note About Object-Oriented Databases earlier in this chapter), you generally need to plan in advance how you will display and manipulate this type of data.
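As a sketch only: the type name varies by vendor (BLOB in several engines, IMAGE in SQL Server 6.x, LONG BINARY in SQL Anywhere), and the table here is hypothetical.

CREATE TABLE EMP_PHOTO
  (Emp_No SMALLINT NOT NULL,
   Photo_Desc VARCHAR (50),
   Photo BLOB,  -- the raw bytes of a bitmap or document
   CONSTRAINT Pk_Photo PRIMARY KEY (Emp_No))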
Database Design
The key to a solid client/server application is a solid database design. Two approaches dominate the task, and they have a considerable amount of overlap. Entity-relationship modeling seeks to identify entities, provide a unique identifier for each, determine relationships between those entities, provide a method for enforcing those relationships, and then assign attributes to those entities. This approach provides a logical design of the database, where entities map to tables and attributes map to columns. The other approach is database normalization. This approach seeks to take an existing design and validate it against a number of design guideline rules to eliminate processing anomalies. You accomplish this task by reassigning columns to existing or new tables. Although entire books have been written on the theory and practice of database design, I provide an overview of the essentials in the next few pages.
Entity-Relationship Modeling
With entity-relationship modeling, you seek to identify those entities that represent a business and eventually map them into a physical database design. Figure 2.5 shows a graphical representation of the process.
Figure 2.5 Summary of the entity-relationship modeling process.
Let's assume that we are modeling a simple college. The first thing to do is
identify the business entities. An entity is something physical with a unique identifier. If we were modeling a mail-order company, typical entities would be Customers, Orders, Items, and so on. For our example, we have Students and Courses. Because an entity must be uniquely identifiable, we assign a primary key to each one and draw them on a piece of paper, as shown in Figure 2.6.
Figure 2.6 Step 1 of the modeling process.
Next, for each pair of entities, we determine whether there is a relationship between the two. In this case, Students and Courses are indeed related. Therefore, we draw a line to represent that relationship, as shown in Figure 2.7. In our mail-order company, we would have determined that Customers are related to Orders (because customers place orders) and that Orders are related to Items (because items appear on orders) but that Customers are not related to Items. It might be helpful to place some wording on the drawing to describe how the two entities are related.
Figure 2.7 Step 2 of the modeling process.
The next step is to determine the nature of the relationship. To do this, you need to ask yourself two questions for each related pair of entities: For each occurrence of the first entity, how many occurrences might there be of the second entity? In addition, for each occurrence of the second entity, how many occurrences might there be of the first entity? Using our example, for each occurrence of Students, how many occurrences of Courses might there be? If there can be only one occurrence of Courses per occurrence of Students, draw a single arrowhead next to the Courses entity. If there can be more than one, draw two arrowheads. If there can be zero occurrences of Courses for each occurrence of Students, also draw a 0 on the line just before the arrowheads. In our example, a student can take zero, one, or more than one course. For any given occurrence of Courses, how many occurrences of Students might there be? For purposes of this exercise, we will assume that it can be one or many but not zero. The result is shown in Figure 2.8. When you have double arrows at each end, the relationship is called a many-to-many relationship. The 0 drawn at the Courses end marks what is called a conditional relationship.
Figure 2.8 Step 3 of the modeling process identifies the nature of the relationship between the two entities.
A relational database cannot enforce a many-to-many relationship directly. Therefore, you need to resolve this problem by creating a logical entity called an assignment entity or a relationship entity. Instead of relating the students and courses to each other, relate them each to the new entity, which we call Student_Courses. Because every entity must have a primary
key, we assign one to the Student_Courses entity. Whenever you create a relationship entity, its primary key is always a combination of the primary keys of the two other entities: in this case, Stud_ID and Course_ID. A primary key that consists of more than one item is called a compound key. The relationship entity always has a one-to-many relationship with both of the original entities, and the new entity is always at the many end of the relationship (it is, therefore, the child entity). The revised diagram is shown in Figure 2.9.
Figure 2.9 Step 4 of the modeling process resolves any many-to-many relationships by creating relationship entities.
Next, you need to assign foreign keys to enforce the relationships between the tables. Remember that a foreign key means that for each occurrence of a child entity, there must be a corresponding value in the parent entity. In entity-relationship modeling, the foreign key of the child table always corresponds to the primary key of the parent table. Also, remember that wherever a one-to-many relationship occurs, there is a foreign key relationship. Figure 2.10 shows two foreign keys in the Student_Courses entity. The Stud_ID that appears in the Student_Courses entity is a foreign key referencing the Students entity.
Figure 2.10 Step 5 of the modeling process assigns foreign keys to enforce referential integrity rules.
The last step is to assign attributes, which are things that describe the entities. The attributes always include the primary key and any foreign keys. Sometimes, there is room to assign attributes on the same drawing as the relationship diagram, but typically, it is done on a separate piece of paper using a matrix, as shown in Figure 2.11.
Figure 2.11 Step 6 of the modeling process assigns attributes to each of the entities.
With the output of the entity-relationship modeling process, you have the information that you need to create a physical database design from the logical one (as represented by the final model). Entities become tables, attributes become columns, and primary and foreign keys are identified. However, you
should first validate your database design by normalizing it. This process is described in the next section.
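To make the mapping concrete, here is a sketch of the DDL that the college model might translate into. The key columns come from the figures; the descriptive columns and data types are assumptions added for illustration.

CREATE TABLE STUDENTS
  (Stud_ID SMALLINT NOT NULL,
   Stud_Name VARCHAR (30),  -- assumed attribute
   CONSTRAINT Pk_Stud PRIMARY KEY (Stud_ID))

CREATE TABLE COURSES
  (Course_ID SMALLINT NOT NULL,
   Course_Title VARCHAR (30),  -- assumed attribute
   CONSTRAINT Pk_Course PRIMARY KEY (Course_ID))

-- The relationship entity: a compound primary key plus one foreign
-- key to each parent; it sits at the many end of both relationships
CREATE TABLE STUDENT_COURSES
  (Stud_ID SMALLINT NOT NULL,
   Course_ID SMALLINT NOT NULL,
   CONSTRAINT Pk_Stud_Course PRIMARY KEY (Stud_ID, Course_ID),
   CONSTRAINT Fk_SC_Stud FOREIGN KEY (Stud_ID)
     REFERENCES STUDENTS (Stud_ID),
   CONSTRAINT Fk_SC_Course FOREIGN KEY (Course_ID)
     REFERENCES COURSES (Course_ID))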
Database Normalization
There is a lot of overlap between the entity-relationship modeling process discussed in the last section and the database normalization process discussed in this section. It is quite conceivable that you can properly design a database using either approach alone. However, I don't recommend it, because each approach has advantages that the other doesn't. The entity-relationship modeling process is particularly well suited for identifying database integrity constraints, although it offers little help to ensure that attributes are assigned to the proper entities. Database normalization seeks to do just this: verify attribute assignment by the application of standardized rules. We do this by comparing our design to a series of normal forms. We say that our database is fully normalized when it is in third normal form. Each normal form seeks to eliminate data redundancy and to eliminate insert, update, and delete anomalies. In terms of database operations, an anomaly is an action that either produces an undesirable effect or prevents us from performing a desired action in the database. I discuss these anomalies throughout this section. For purposes of illustration, I use a simplified order-entry system. The order-entry system has the following pieces of information to be stored in the database:
ORDERS
  Ord_No
  Ord_Date
  Cust_No
  Cust_Name
  Cust_Address
  Cust_City
  Cust_State
  Cust_Postal_Code
  Item_No
  Item_Desc
  Item_Qty
  Item_Price
Further, I make the assumptions that a customer can place any number of orders, that any given order has only one customer, and that an order can be placed for any number of items. (In other words, a customer can call and purchase a number of different items all on the same order.) However, any one item can appear on the order only once. To begin, we need to assign a primary key to our present design. We will call the primary key Ord_No.
First normal form (1NF) requires that a table contain no repeating groups. Because an order can include many items, the item columns (Item_No, Item_Desc, Item_Qty, and Item_Price) repeat for each item on an order, so we move them to a new ORDER_ITEMS table whose primary key is the compound key of Ord_No plus Item_No. Unfortunately, we cannot stop here because we still have some anomalies. First, we have an insert anomaly in that we cannot add an item to the database unless there is an order for that item (because the item number and item description are not recorded until there is a row in the ORDER_ITEMS table). Second is an update anomaly: If an item's description changes, it must be updated on every single order for that item. Third is a delete anomaly: If an order is deleted, the item's description is deleted as well. Therefore, we need to move on to second normal form.
Second normal form, usually abbreviated as SNF or 2NF, states that we can have no partial-key dependencies. A partial-key dependency is a column that is dependent on only part of the primary key. To determine whether we have any second normal form violations, we examine all tables with compound keys (in this case, the ORDER_ITEMS table). An examination of the table shows that Item_Desc is dependent on Item_No in the primary key but not on Ord_No. In other words, an item's description changes if the Item_No is different but not if the Ord_No is different. To resolve this problem, we move the partial-key dependencies to another table whose primary key is the partial key from the original table (in this case, Item_No). Our new design looks like the following:
ORDERS
  Ord_No (PK)
  Ord_Date
  Cust_No
  Cust_Name
  Cust_Address
  Cust_City
  Cust_State
  Cust_Postal_Code
ORDER_ITEMS
  Ord_No (PK)
  Item_No (PK)
  Item_Qty
  Item_Price
ITEMS
  Item_No (PK)
  Item_Desc
This resolves our insert anomaly from the first normal form because we can now insert a new item into the database even if there are no orders for it. It also resolves our update anomaly because we now have only one row to update if an item's description changes. Finally, we have resolved our delete anomaly because deleting an order no longer deletes the associated items. However, we still have anomalies. We have an insert anomaly in that we can't add a customer to the database unless that customer places an order. We have an update anomaly in that if we need to change the customer's name or city, we have to update every order that customer has placed. We have a delete anomaly in that if we delete an order, we also delete the customer. To resolve this, we move on to third normal form.
Third normal form (3NF) requires that every column be dependent on the whole primary key and nothing else. The customer columns (Cust_Name, Cust_Address, and so on) depend on Cust_No, not on Ord_No, so we move them to a new CUSTOMERS table whose primary key is Cust_No, leaving Cust_No on ORDERS as a foreign key. We are now in third normal form. Normally, you can stop at third normal form and conclude that your database is fully normalized. There are actually other normal forms: Boyce-Codd normal form, fourth normal form, and fifth normal form. These normal forms are used for rare circumstances, such as resolving circular references, and nearly always indicate a totally inappropriate data model. In my 15 years of working with relational databases, I have never needed to consider anything beyond third normal form. Sometimes, you need to step back and examine the design with a critical eye toward performance. If you see that you are going to end up having to join too many tables, you might want to consider selectively denormalizing back to second or first normal form. My recommendation is to build the database in third normal form, load it with some sample data, and then run some benchmark tests against it to see if you have any performance bottlenecks. If you do, examine those bottlenecks and decide whether they warrant denormalization.
Data Definition Language
DDL consists of three statements: CREATE, ALTER, and DROP. For purposes of this discussion, the next three sections deal with using these statements to work with tables. Following that, I discuss other database objects, such as indexes and keys.
CREATE TABLE
The CREATE TABLE statement has the syntax:
CREATE TABLE TABLE_NAME
  (col_name DATATYPE [NOT NULL],
   col_name ...)
Following CREATE TABLE, specify the name of the table. Inside parentheses, list the columns, separated by commas. After each column name, list the data type and, optionally, NOT NULL if the column is not allowed to have a null value. The syntax to create the EMPLOYEE table that I use throughout the remainder of this book is as follows:
CREATE TABLE EMPLOYEE
  (Emp_No SMALLINT NOT NULL,
   Emp_LName VARCHAR (21) NOT NULL,
   Emp_FName VARCHAR (15),
   Emp_SSN CHAR (9),
   Emp_DOB DATE,
   Emp_Hire_Date DATE NOT NULL,
   Emp_Term_Date DATE,
   Emp_Salary NUMERIC (9,2),
   Emp_Dept_No SMALLINT,
   Emp_Mgr_ID SMALLINT,
   Emp_Gender CHAR (1),
   Emp_Health_Ins CHAR (1),
   Emp_Dental_Ins CHAR (1),
   Emp_Comments VARCHAR (255)
  )
TIP About The Tables Throughout this book, I use a series of tables, including the EMPLOYEE table here. Appendix A contains the syntax to create all of these tables along with primary keys, foreign keys, and so on. Because there are small differences between different databases, the CD-ROM contains these statements in RDBMS-specific directories. For instance, the Oracle CREATE statements are in the directory \SAMPDATA\ORACLE. If your database is not included, it is highly likely that one of the supplied syntax variations will work just fine. Otherwise, the modifications should be simple. Additionally, the CD-ROM includes the INSERT statements needed to create the data.
You can optionally specify various table constraints, such as primary keys, at the same time you create the table. The following syntax is for Sybase SQL Anywhere, but if you change the line Emp_Salary NUMERIC (9,2) to read Emp_Salary NUMBER (9,2), it will work with Oracle as well:
CREATE TABLE EMPLOYEE
  (Emp_No SMALLINT NOT NULL,
   Emp_LName VARCHAR (21) NOT NULL,
   Emp_FName VARCHAR (15),
   Emp_SSN CHAR (9),
   Emp_DOB DATE,
   Emp_Hire_Date DATE NOT NULL,
   Emp_Term_Date DATE,
   Emp_Salary NUMERIC (9,2),
   Emp_Dept_No SMALLINT,
   Emp_Mgr_ID SMALLINT,
   Emp_Gender CHAR (1),
   Emp_Health_Ins CHAR (1),
   Emp_Dental_Ins CHAR (1),
   Emp_Comments VARCHAR (255),
   CONSTRAINT Pk_Emp_Id PRIMARY KEY (Emp_No),
   CONSTRAINT Fk_Emp_Dept FOREIGN KEY (Emp_Dept_No)
     REFERENCES DEPARTMENT (Dept_No)
  )
This syntax presupposes that the table DEPARTMENT already exists. If it doesn't, you need to create that table first. When you create tables and add constraints with the same code, figuring out which table needs to be built first can get complicated. I recommend that you create all your tables first and then add the constraints later with ALTER statements.
ALTER TABLE
The ALTER TABLE command changes a table's structure in some way. Exactly what you can change and when you can change it is somewhat dependent on the exact RDBMS. However, you can generally add a column at any time using the following syntax:
ALTER TABLE EMPLOYEE ADD Emp_Maiden CHAR (21) NOT NULL
This example adds a column named Emp_Maiden with a data type of CHAR (21). It is specified NOT NULL. The column is always added to the end of the table. You cannot, unfortunately, delete a column. You can add constraints to a table using syntax similar to the following:
ALTER TABLE EMPLOYEE ADD CONSTRAINT Pk_Emp_Id PRIMARY KEY (Emp_No)

ALTER TABLE EMPLOYEE ADD CONSTRAINT Fk_Emp_Dept
  FOREIGN KEY (Emp_Dept_No) REFERENCES DEPARTMENT (Dept_No)
This example adds a primary key to the EMPLOYEE table and a foreign key that references the Dept_No column in the DEPARTMENT table. Note that many databases assume the primary key when you don't specify the column name on the parent table. In other words, the following statement causes SQL to assume that the primary key of the DEPARTMENT table is being referenced:
ALTER TABLE EMPLOYEE ADD CONSTRAINT Fk_Emp_Dept
  FOREIGN KEY (Emp_Dept_No) REFERENCES DEPARTMENT
Some databases allow you to modify the definition of an existing column as long as the currently stored data in that column does not violate the new column definition. For instance, you can specify a column as NOT NULL even if it was not originally defined that way, as long as there are no null values already stored in the column:
ALTER TABLE EMPLOYEE MODIFY Emp_Gender NOT NULL
DROP TABLE
The DROP TABLE command deletes a table and all of its data from the database. Also dropped are all integrity constraints and permissions (I discuss permissions under Data Control Language later in this chapter). The syntax is:
DROP TABLE EMPLOYEE
Indexes
The database allows you to add indexes on one or more columns of any table. An index may be unique, meaning that it does not permit duplicate values, or non-unique. You can also use a unique index as a constraint to restrict the values in a column on a table. For instance, you might want to add a unique index on the Social Security number of an employee table. The syntax to create an index is as follows:
CREATE [UNIQUE] INDEX INDEX_NAME ON TABLE_NAME (col_name, col_name ...)
The TABLE_NAME must be any valid table name. The col_name must be a valid column on that table. If more than one is listed, the column names must be comma-delimited. The following examples add a unique index on the Emp_SSN column and a non-unique index on the Emp_LName and Emp_FName columns:
CREATE UNIQUE INDEX IDX_SSN ON EMPLOYEE (Emp_SSN)

CREATE INDEX IDX_NAME ON EMPLOYEE (Emp_LName, Emp_FName)
Most databases allow you to create one clustered index per table. With a clustered index, the rows are arranged in the same order as the index, which speeds up retrieval. Because the syntax for creating clustered indexes varies greatly from database to database, check your RDBMS documentation (a sketch for one product appears below). You will want to use indexes judiciously. They can speed up database operations tremendously when you often access data in a certain manner. For instance, if you need to list employees in last name, first name order, an index tremendously speeds up retrieval time. On the other hand, indexes need to be maintained. If you update a table on the database, you must update all the indexes on that table as well. Thus, you want to weigh faster retrieval against the overhead of maintaining each index when deciding which columns to index.
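As one concrete example of that vendor-specific syntax, Microsoft SQL Server (and Sybase) use a CLUSTERED keyword; the sketch below arranges the employee rows in name order.

-- Microsoft SQL Server / Sybase syntax; other RDBMSs differ
CREATE CLUSTERED INDEX IDX_LNAME
ON EMPLOYEE (Emp_LName, Emp_FName)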
Integrity Constraints
In the previous discussion of CREATE TABLE and ALTER TABLE, I presented some sample syntax for creating primary key and foreign key constraints. Different RDBMSs have some variance in how to create these constraints, and most allow you to create constraints in a variety of different manners. The syntax that I presented is fairly generic and works on most RDBMSs. For other options, consult your RDBMS documentation. A couple of notes are in order. Consider the following statements, which add three constraints to the EMPLOYEE table:
ALTER TABLE EMPLOYEE ADD CONSTRAINT Pk_Emp_Id PRIMARY KEY (Emp_No)

ALTER TABLE EMPLOYEE ADD CONSTRAINT Fk_Emp_Dept
  FOREIGN KEY (Emp_Dept_No) REFERENCES DEPARTMENT (Dept_No)

ALTER TABLE EMPLOYEE ADD CONSTRAINT Ch_Gender
  CHECK (Emp_Gender IN ('M','F'))
The first example adds a primary key constraint to the table, forcing the Emp_No column to be a unique identifier for the table. The second adds a foreign key constraint, requiring that the value stored in Emp_Dept_No be a valid value in the Dept_No column on the DEPARTMENT table. Because the Emp_Dept_No column was not defined as NOT NULL, the column may also contain null values. (This is an example of a conditional foreign key constraint; it actually says that if the value is not null, it must be a valid value from the parent table.) The third is an example of a check constraint, which is not supported by all RDBMSs. A check constraint restricts the values of a column by providing an SQL function. In this case, it specifies that gender must be M or F. The constraint could also reference another column on the same row, as in the following example:
ALTER TABLE EMPLOYEE ADD CONSTRAINT Ch_Hire_Date
  CHECK (Emp_Hire_Date > Emp_DOB)
This example requires that hire date be greater than date of birth. You cannot use an SQL SELECT in a check constraint. When a check constraint references another column on the same row, it is called a user integrity constraint. When it provides an absolute check (such as being only M or F without regard for other columns on the row), it is called a domain integrity constraint. Generally, you do not need to provide a constraint name. If you do not, the database generates one. My recommendation is that you do provide a name so that any messages from the database are meaningful. For instance, if you attempt to set Emp_Gender = 'X', the database responds with a message similar to Update Failed - Integrity Constraint: CH_GENDER. If you elect not to name the constraint, you should also omit the CONSTRAINT keyword, as in the following example:
ALTER TABLE EMPLOYEE ADD CHECK (Emp_Gender IN ('M','F'))
Views
A view is an SQL SELECT permanently stored on the database. A view can be convenient for your users because it helps them avoid typing complicated queries. It can also be used to show a user a portion of a table without revealing the entire underlying table. The following example creates a view on the EMPLOYEE table that allows users to see information such as name but not salary:
CREATE VIEW EMPNOSAL AS
  SELECT Emp_No, Emp_LName, Emp_FName, Emp_Dept_No, Emp_Mgr_ID
  FROM EMPLOYEE
To use the view, select from it as though it were a table:
SELECT * FROM EMPNOSAL ORDER BY Emp_LName, Emp_FName
A view is not a copy of the data, just a stored query. Also note that most databases allow you to update through a view, as in the following example:
UPDATE EMPNOSAL SET Emp_Dept_No = 200 WHERE Emp_No = 100
In the example, the employee's department number was updated (I discuss the syntax for the UPDATE command in Chapter 3). Because the view EMPNOSAL does not contain the Emp_Salary column, the user can't update salary information.
Data Control Language
The two commands used are GRANT and REVOKE. For a user to have access to the database, he or she must first be granted connect privileges:
GRANT CONNECT TO user_ID IDENTIFIED BY password
This syntax creates a new user and password. Assuming the user ID is Jane, to alter the password, use the following syntax:
GRANT CONNECT TO Jane IDENTIFIED BY new_password
To grant access to individual objects (such as a table), you grant a comma-delimited list of permissions to a comma-delimited list of users on one table at a time:
GRANT SELECT, UPDATE, DELETE ON EMPLOYEE TO Tom, Dick, Harry
To grant all permissions, use GRANT ALL (some RDBMSs spell it GRANT ALL PRIVILEGES). To grant permissions to all users, specify PUBLIC instead of a user list:
GRANT SELECT ON DEPARTMENT TO PUBLIC
You can selectively revoke permissions, but you cannot revoke access from yourself. The syntax to revoke privileges is similar to the following:
REVOKE UPDATE, DELETE ON EMPLOYEE FROM Dick, Harry
In this example, we have taken away update and delete privileges from users Dick and Harry. Their select privileges remain. Most databases allow you to create groups and assign users to groups. You can then grant permissions to and revoke permissions from the groups. This technique helps simplify administration; a sketch appears below.
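As a sketch of the group technique in Sybase SQL Anywhere terms (the group and user names are hypothetical; other RDBMSs use different syntax, such as roles in Oracle):

-- Create a user ID, mark it as a group, and give the group
-- its table permissions
GRANT CONNECT TO Clerks
GRANT GROUP TO Clerks
GRANT SELECT, UPDATE ON EMPLOYEE TO Clerks

-- Enroll individual users; they inherit the group's permissions
GRANT MEMBERSHIP IN GROUP Clerks TO Tom
GRANT MEMBERSHIP IN GROUP Clerks TO Dick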
-----------
query result. (I expanded upon this in Chapter 2 when I defined the term RDBMS.)
What Is SQL?
The American National Standards Institute (ANSI) defines various computer industry standards. We are all familiar with the ANSI character set, for instance, which defines each character to be 1 byte long, with the decimal value 65 representing the letter A. SQL-92 is an ANSI definition of what features must be supported in a full implementation of SQL. For example, the language needs to support the SELECT statement and the SUM function. If a vendor's database product is in full compliance with these standards, it is ANSI-92 compliant. The major relational database products, such as Oracle and Microsoft SQL Server, are substantially ANSI-92 compliant. (Microsoft Access uses Jet SQL. Although Jet SQL is close, it is not ANSI-92 compliant. I discuss the differences in Appendix B.) Additionally, most vendors add extensions (additional features) implemented as what are commonly called dialects of SQL. Oracle uses PL/SQL, whereas Sybase and Microsoft use Transact-SQL (usually called T-SQL). Microsoft has been moving away from the Sybase model of T-SQL. Mostly, the dialects are close enough in syntax that moving from one to the other is relatively easy. When application programs access a relational database, they create SQL statements and send them to the RDBMS. The RDBMS processes the statements and sends the results back to the application program. Each RDBMS exposes its functionality via an Application Programming Interface (API). To shelter the developer from the intricacies of learning different APIs for each RDBMS, three categories of drivers were developed to interface with Visual Basic:
- VBSQL: A Visual Basic-specific set of drivers that allows a native interface to Microsoft SQL Server.
- ODBC (Open Database Connectivity): An API that applications can address to expose the underlying functionality of a wide variety of databases. The many variations on how this is done are the subject of much of this book. Appendix C overviews the implementation of ODBC.
- OLE DB: A new, still evolving API that exposes the underlying functionality of a wide variety of databases in an object-oriented manner. OLE DB also allows the joining of disparate data sources.
With ODBC, the application communicates with a single API, which in turn communicates with ODBC drivers written for each of the different back-end databases. Figure 3.1 illustrates this communication process.
Figure 3.1 Client programs communicate with the ODBC API, which in turn communicates with the individual database-specific drivers. These drivers interact with the back-end databases.

In reality, not all ODBC drivers are created equal. There are actually three levels of ODBC compliance:

Level 0, sometimes called Minimum ODBC, is a set of minimal functionalities that must be supplied by an ODBC driver, including the ability to perform basic SQL SELECTs and so on.

Level 1, sometimes called Core ODBC, is a more thorough implementation of SQL, including statements such as ALTER TABLE as well as more advanced SELECT functionality, such as the MAX and MIN functions.

Level 2, or Extended ODBC, is an implementation of SQL that is at least ANSI-92 compliant, including outer joins, cursor-positioned UPDATEs, and so on.

The exact definition of each of these compliance levels is discussed in Appendix C. It is not always easy to determine the level of conformance of a given ODBC driver, and the performance of drivers varies from vendor to vendor. Microsoft builds a variety of drivers for both Microsoft and non-Microsoft back-end database products. Intersolv also markets a wide variety of drivers that you can use in your programs. Finally, some vendors create drivers for their own database products; the most notable example is the very well implemented, Level 2-compliant Sybase SQL Anywhere database. For technical information about ODBC standards and driver development, refer to Appendix C in this book, as well as to the ODBC 3.0 Programmer's Reference And SDK Guide available from Microsoft Press.
SELECT
The SELECT statement is used to retrieve rows from the database. The basic syntax of the SELECT statement is:

SELECT [DISTINCT] column-list | *
FROM table-list
[WHERE condition1 AND condition2]
[GROUP BY column-list [HAVING condition1 AND condition2]]
[ORDER BY column-list]

All clauses except SELECT and FROM are optional. Use the SELECT clause to list those columns that are to appear in the result set. Each column name must be separated by a comma. To select all columns, you can use the asterisk instead of listing each column separately. Use the FROM clause to list the tables that contain the columns being referenced. For more than one table, separate the table names with commas. The following example selects three columns from the EMPLOYEE table:
SELECT EMP_ID, EMP_FNAME, EMP_LNAME FROM EMPLOYEE

The next example selects all the columns from the EMPLOYEE table:

SELECT * FROM EMPLOYEE
Derived Columns
SQL allows you to select columns that are not actually stored in the database by creating derived columns. These columns might be string literals, system functions, or the results of operations on table columns.
TIP About The Examples Except where specifically noted, all of the examples in this chapter use the sample data provided on the CD-ROM. See Appendix A for details on loading this information into your database.
The following SELECT creates a derived column that is the result of multiplying EMP_SALARY by 1.08:

SELECT EMP_SALARY * 1.08 FROM EMPLOYEE

The next example creates a derived column that is actually a string literal:

SELECT EMP_FNAME, EMP_LNAME, 'IS AN EMPLOYEE' FROM EMPLOYEE

As shown in the following result set, the string literal is returned for every row:

EMP_FNAME  EMP_LNAME  'IS AN EMPLOYEE'
---------  ---------  ----------------
ANN        CALLAHAN   IS AN EMPLOYEE
BOB        JOHNSON    IS AN EMPLOYEE
CAROL      DEMORA     IS AN EMPLOYEE
Some system functions are not truly related to any data at all. For instance, SYSDATE returns the current date and time from the database server. Even still, the SELECT statement must have a FROM clause. Some RDBMSs provide a dummy table from which to select such non-data-related functions:

SELECT SYSDATE FROM DUAL

This example is from Oracle, which provides a table named DUAL with a single row so that you can select against system-level functions. If you select SYSDATE from the EMPLOYEE table, and if the table has 100 rows, you get a result set of 100 lines of the current date and time.
If your database does not provide a dummy table, you can easily create one:

CREATE TABLE DUMMY (DUM_COL CHAR(10))
INSERT INTO DUMMY VALUES ('DUMMY')
SELECT SYSDATE FROM DUMMY

When the result set is displayed, the column headings default to the column names. You can rename a column heading using the AS keyword:

SELECT EMP_NO AS 'EMPLOYEE NUMBER' FROM EMPLOYEE

Some databases allow you to omit the AS keyword. If the new column name contains a space, you must surround the column name with double quotes. Some databases require that you use double quotes all the time (regardless of whether there is a space in the name).
The following WHERE clause returns all female employees earning more than $50,000 as well as any employee earning more than $75,000 (regardless of gender):

WHERE (EMP_GENDER = 'F' AND EMP_SALARY > 50000)
   OR EMP_SALARY > 75000

You may use a scalar function in a WHERE clause but not an aggregate function. I discuss SQL functions later in this chapter, but for now, understand that a scalar function is one that operates on a single value at a time (such as returning the length of a last name), whereas an aggregate function operates on a range of values (such as returning the sum of all salaries). The following example returns rows where the length of the last name is greater than five:

WHERE LENGTH(EMP_LNAME) > 5

A column used in the WHERE clause does not have to appear in the SELECT list, as shown in the following example:

SELECT EMP_ID, EMP_FNAME, EMP_LNAME FROM EMPLOYEE
WHERE EMP_GENDER = 'F'

You can perform wild-card searches using the LIKE keyword. To do so, use the underscore (_) character to represent a single character and the percent sign (%) to represent any number of characters. The following statement searches for all last names beginning with the letter B:

WHERE EMP_LNAME LIKE 'B%'
The next example locates all those rows where the last names have the letter B in the second position and the letter D in the fourth position:

WHERE EMP_LNAME LIKE '_B_D%'

The WHERE clause also supports the use of the BETWEEN and IN modifiers. With BETWEEN, you can specify an inclusive range of values, as in the following example, which returns rows where the salary is from $50,000 to $60,000:

WHERE EMP_SALARY BETWEEN 50000 AND 60000

The IN keyword allows you to search a list of discrete values. You separate each value with commas and place the entire list inside parentheses. If the values are strings, they must be in single quotes:

WHERE EMP_LNAME IN ('BROWN', 'SMITH', 'JOHNSON')
TIP About Databases And Case Sensitivity When dealing with character data (strings), the database is case sensitive just as Visual Basic is case sensitive when comparing two strings. Later in this chapter, I will introduce case conversion functions to help deal with this problem. Though the SQL code samples in this book have all keywords in uppercase, databases are generally not case sensitive with column names unless you surround the column names with double quotes.
Testing for the existence of null values requires special operators. Because any operation involving null is automatically null (I discuss this fully later in this chapter), SQL provides the IS NULL and IS NOT NULL operators. The following clause locates rows where YEARS is null:

WHERE YEARS IS NULL
Joining Tables
When you select data from more than one table, you need to tell SQL how the tables are related. You do this with the WHERE clause. Consider the DEPARTMENT and EMPLOYEE tables (see Appendix A for how the tables are defined). If I want to produce a list of departments and all the employees in each of those departments, I need to relate the EMPLOYEE table to the DEPARTMENT table. Each row on the EMPLOYEE table has an EMP_DEPT_NO column, and the DEPARTMENT table has a DEPT_NO column. Therefore, given a row in the DEPARTMENT table, I want to list all rows in the EMPLOYEE table where DEPT_NO is equal to EMP_DEPT_NO. This forms a join condition:

SELECT DEPT_NO, DEPT_NAME, EMP_NO, EMP_FNAME, EMP_LNAME
FROM EMPLOYEE, DEPARTMENT
WHERE DEPT_NO = EMP_DEPT_NO

Although I did not do so, many developers use common column names between tables. Whereas I used EMP_DEPT_NO in the EMPLOYEE table to store the department number, others might have simply used DEPT_NO. Obviously then, the two tables both have a column with the same name (DEPT_NO). There is nothing wrong with this at all. However, if an SQL query references a column name that appears in more than one table in the query, you have to qualify the column name with the table name to avoid ambiguity:

SELECT DEPARTMENT.DEPT_NO, DEPT_NAME, EMP_NO, EMP_FNAME, EMP_LNAME
FROM EMPLOYEE, DEPARTMENT
WHERE EMPLOYEE.DEPT_NO = DEPARTMENT.DEPT_NO

This example shows how to qualify the column names where ambiguity exists. The example assumes, of course, that both tables use the identical column name DEPT_NO.
Table Aliasing
Typing in all those table names can be tiring, especially when qualifying column names. SQL allows you to alias table names in the FROM clause:

FROM EMPLOYEE EMP, DEPARTMENT DEP

Once you have aliased a table, you use the alias instead of the table name elsewhere in the query:

SELECT EMP.EMP_NO, DEP.DEPT_NO

Aliasing also has implications in correlated subqueries and self-joins, both of which I discuss later in this chapter. Aliased names are often referred to as correlation names, particularly when dealing with subqueries.
You can join together more than two tables. If in the previous example I want to add the location name (from the LOCATION table), I have to join that table to one of the other two tables. The DEPARTMENT table has a column DEPT_LOC_ID that corresponds to the LOC_ID column in the LOCATION table. Therefore, my query might look like this:

SELECT DEPT_NO, DEPT_NAME, LOC_NAME, EMP_NO, EMP_FNAME, EMP_LNAME
FROM EMPLOYEE, DEPARTMENT, LOCATION
WHERE DEPT_NO = EMP_DEPT_NO
AND DEPT_LOC_ID = LOC_ID

Sample output based on the above query is shown here:

DEPT_NO  DEPT_NAME   LOC_NAME   EMP_NO  EMP_FNAME  EMP_LNAME
-------  ----------  ---------  ------  ---------  ---------
100      ACCOUNTING  NORTHEAST  133     MAUREEN    PODANSKI
100      ACCOUNTING  NORTHEAST  101     ANN        CALLAHAN
100      ACCOUNTING  NORTHEAST  105     EUNICE     BROWN
100      ACCOUNTING  NORTHEAST  113     MICHAEL    ANDERSON
Self-Joins
A confusing concept for many developers is that you can join a table to itself. Consider the EMPLOYEE table: One of the columns is EMP_MGR_ID, which represents the manager of the employee. If you select the contents of the EMPLOYEE table, you will see that employee number 126's (David Madison's) manager is employee 110 (John Smith). Assume I want a list of all employees and the names of their managers. In Figure 3.2, you can see where I need to join a column from one row (EMP_MGR_ID) with a column in another row (EMP_NO). Essentially, I need to pretend that there are two copies of the EMPLOYEE table.
Figure 3.2 To find the name of David Madison's manager, I need to reference the EMP_MGR_ID column on David's row and use it to find John Smith's row.

The query that I put together uses two aliases for the EMPLOYEE table. I use the first alias (EMP) to find the employee information and the second (MGR) to find the manager information:

SELECT EMP.EMP_NO AS 'EMP NO', EMP.EMP_FNAME AS 'EMP FIRST',
       EMP.EMP_LNAME AS 'EMP LAST', MGR.EMP_NO AS 'MGR ID',
       MGR.EMP_FNAME AS 'MGR FIRST', MGR.EMP_LNAME AS 'MGR LAST'
FROM EMPLOYEE EMP, EMPLOYEE MGR
WHERE EMP.EMP_MGR_ID = MGR.EMP_NO
The join is on the EMP_NO column from the MGR copy of the EMPLOYEE table to the EMP_MGR_ID column of the EMP copy of the table. Sample output based on the query is presented here. Note that Ann Callahan has no manager:

EMP NO  EMP FIRST  EMP LAST  MGR ID  MGR FIRST  MGR LAST
------  ---------  --------  ------  ---------  -----------
111     KATHRYN    AMES      119     URSULA     SMITHSONIAN
113     MICHAEL    ANDERSON  101     ANN        CALLAHAN
106     FRANK      BENSON    110     JOHN       SMITH
105     EUNICE     BROWN     101     ANN        CALLAHAN
101     ANN        CALLAHAN  101     ANN        CALLAHAN
Self-joins are not terribly difficult once you get used to the idea of joining a table to itself; they can often be useful to find information in one row that is related to information in another row in the same table.
ORDER BY

The following result set shows employees sorted by department and, within each department, by salary in descending order:

EMP_DEPT  EMP_FNAME  EMP_LNAME  EMP_SALARY
--------  ---------  ---------  ----------
100       ANN        CALLAHAN   127500.00
100       MICHAEL    ANDERSON   72563.34
100       ROSE       DANIELS    67572.42
100       EUNICE     BROWN      61426.08
100       CHARLES    GERRFON    53496.87
100       WINIFRED   VANCE      37712.10
100       MAUREEN    PODANSKI   34591.90
100       GEORGE     DE NORVA   21313.97
200       DAVID      MADISON    77492.47
200       HENRIETTA  KEOUGH     72789.20
200       BRIAN      JOHNSON    72750.62
The columns that you order by do not have to appear in the SELECT list, although it makes no sense not to include them. (How would the reader know that data was sorted by department if you did not display the department?) You can order positionally by specifying the column number instead of the column name, as shown in this example:

ORDER BY 1, 4 DESC

However, it is likely that the next ANSI SQL standard will remove this option from the SQL definition. You may, therefore, be better off ordering by the column names themselves. If you have renamed the columns (see the discussion of SELECT earlier in this chapter), you can order by the new column name as shown:

SELECT EMP_FNAME AS 'FIRST', EMP_LNAME AS 'LAST'
FROM EMPLOYEE
ORDER BY 'LAST', 'FIRST'

Under certain circumstances, you must order by the new column names. An example is performing a UNION (which I discuss next).
UNION

A UNION combines the result sets of two SELECTs into a single result set. Because the column names can differ between the two queries, you refer to the combined columns by their renamed headings or by position:

SELECT EMP_NO AS 'NUMBER', EMP_LNAME AS 'NAME' FROM EMPLOYEE
UNION
SELECT DEPT_NO, DEPT_NAME FROM DEPARTMENT
ORDER BY 2, 1
SQL Functions
SQL provides a number of functions to manipulate information from the database. Functions fall into two categories: Scalar functions act on one item at a time, and aggregate functions act on a range of values at a time. Deciding whether a function is scalar or aggregate is more than an academic exercise. Consider the WHERE and HAVING clauses (I discuss HAVING later in this chapter):

WHERE clause: You can use a scalar function as a search condition in a WHERE clause. For instance, WHERE SQRT(EMP_SALARY) > 5000 (SQRT returns the square root of an expression) is valid, but WHERE SUM(EMP_SALARY) > 250000 is not.

HAVING clause: You can use aggregate functions in a HAVING clause to restrict a result set. For instance, HAVING SUM(SALARY) > 250000 is valid, but HAVING SQRT(EMP_SALARY) > 5000 is not.

Whether a function is scalar or aggregate also has an impact on the handling of null values.
Null Handling
A value of NULL stored in a database creates special problems. Consider the following:

100 + NULL = ?

Nonintuitively, the answer to this equation is NULL. Any operation performed on NULL is automatically NULL. Assume that you want to give every employee a $100-a-week raise, and you want to see what the new weekly salary for each employee would be. You might code the following query:

SELECT EMP_FNAME, EMP_LNAME, EMP_SALARY/52 + 100 FROM EMPLOYEE

The SELECT works fine except for those employees whose salary happens to be NULL. The result for those employees is still NULL. Different databases handle this problem with various functions. With Oracle, you can use the NVL function:

SELECT EMP_FNAME, EMP_LNAME, NVL(EMP_SALARY, 0)/52 + 100 FROM EMPLOYEE

In this case, the NVL function specifies that if EMP_SALARY is NULL, then the value 0 should be substituted. Some other RDBMSs implement the similar function ISNULL:

SELECT EMP_FNAME, EMP_LNAME, ISNULL(EMP_SALARY, 0)/52 + 100 FROM EMPLOYEE
In this case, the ISNULL function says that if EMP_SALARY is equal to NULL, the value 0 should be substituted. If it is not equal to NULL, the EMP_SALARY value should be used. (The ISNULL function is similar to the IF function in most spreadsheets.)
DISTINCT
Arguably, DISTINCT is not as much a function as it is a SELECT statement qualifier. However, because it operates on a single value, I include it here. Its purpose is to return only the unique values of a column:

SELECT DISTINCT (EMP_DEPT_NO) FROM EMPLOYEE

The result is:

EMP_DEPT_NO
-----------
100
200
300
400

Most databases do not require that the column name be surrounded by parentheses when using DISTINCT (however, it does not hurt to use them).
COUNT
COUNT returns a count of all rows that meet a condition. For instance, COUNT(EMP_DEPT_NO) returns 32 because there are 32 values stored in that column. However, COUNT(DISTINCT(EMP_DEPT_NO)) returns 4 because there are only 4 unique values. Often, COUNT(*) is used to find the number of rows on a table:

SELECT COUNT(*) FROM EMPLOYEE

Also, you can use COUNT(*) in conjunction with the GROUP BY clause to determine counts in categories (I discuss GROUP BY later in this chapter):

SELECT EMP_GENDER AS 'GENDER', COUNT(*) AS 'COUNT'
FROM EMPLOYEE
GROUP BY EMP_GENDER

The following result set is returned:

GENDER  COUNT
------  -----
F       18
M       14
SUM
SUM adds the values in a column. To obtain a total of the salaries from the EMPLOYEE table, run the following query:

SELECT SUM(EMP_SALARY) FROM EMPLOYEE

The following is the result:

SUM(EMP_SALARY)
---------------
1861241.06
MIN, MAX, And AVG

MIN, MAX, and AVG return the minimum, maximum, and average of the values in a column, respectively:

SELECT MIN(EMP_SALARY), MAX(EMP_SALARY), AVG(EMP_SALARY)
FROM EMPLOYEE

The result is:

MIN(EMP_SALARY)  MAX(EMP_SALARY)  AVG(EMP_SALARY)
---------------  ---------------  ---------------
18331.50         127500.00        58163.78
LENGTH
As with the Visual Basic Len function, LENGTH returns the length of a column or expression. LENGTH('Smith') returns 5.
SOUNDEX
SOUNDEX searches for strings based upon their similarity (in sound) to the supplied argument. Although this should be a powerful function, its usefulness can be iffy: RDBMSs have different levels of tolerance for whether two words sound alike and may need some tuning. The following query and result come from Sybase SQL Anywhere. Refer to your RDBMS's documentation regarding the use of SOUNDEX.

SELECT EMP_NO, EMP_FNAME, EMP_LNAME
FROM EMPLOYEE
WHERE SOUNDEX('SMYTHE') = SOUNDEX(EMP_LNAME)

The result of the query is:

EMP_NO  EMP_FNAME  EMP_LNAME
------  ---------  ---------
110     JOHN       SMITH
118     STEVEN     SMITH
103     CAROL      SMITH
LEFT And RIGHT

LEFT returns the leftmost n characters from a column, whereas RIGHT returns the rightmost n characters from a column:

SELECT LEFT(EMP_LNAME, 1), RIGHT(EMP_LNAME, 4) FROM EMPLOYEE

Not all databases support the RIGHT and LEFT functions, but if not, the SUBSTR function is adequate.
Concatenation
Some databases provide specific functions to concatenate two strings. For instance, Oracle offers the CONCAT function, which accepts two arguments (the two strings to be concatenated). Almost all RDBMSs support the concatenation operator, which is simply two pipe characters (the pipe character is usually above the backslash on your keyboard and looks like one vertical dash on top of another). The advantage of the concatenation operator is that you can concatenate as many strings as needed without resorting to nested function calls. The following example shows the concatenation operator in use:
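A query along these lines (a sketch; the exact example did not survive in this excerpt) returns each employee's full name as a single derived column:

SELECT EMP_FNAME || ' ' || EMP_LNAME FROM EMPLOYEE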
The LTRIM function removes any leading spaces from a string; its counterpart, RTRIM, removes trailing spaces. Some RDBMSs also provide a TRIM function, which combines the functionality of LTRIM and RTRIM.
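Because a CHAR column is typically padded with trailing spaces, a sketch along these lines strips the padding before measuring the name:

SELECT EMP_LNAME, LENGTH(RTRIM(EMP_LNAME)) FROM EMPLOYEE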
MOD Or REMAINDER
The MOD function returns the remainder after dividing two numbers. MOD(15, 2) returns 1 because 15 divided by 2 equals 7 with a remainder of 1. This can be a useful function when performing a lot of math at the database server, as sketched below. Some RDBMSs call this function REMAINDER, but the purpose and syntax are otherwise identical.
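For instance, a remainder test lets the server pick out every even-numbered employee without any client-side filtering:

SELECT EMP_NO, EMP_LNAME FROM EMPLOYEE
WHERE MOD(EMP_NO, 2) = 0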
POWER
Similar to the Visual Basic exponentiation operator, POWER raises a number to the power of another number: POWER(2, 4) returns 16 (2 raised to the 4th power). Not all RDBMSs support this function. You can use fractional exponents to return the root of a number: POWER(8, 1.0/3) returns 2, which is the third (cube) root of 8.
Date Functions
The different RDBMSs have a wide range in what functions they provide to support operations on dates. For instance, Oracle has a DAYS_AFTER function, which returns the date that falls n days after the supplied date:

SELECT DAYS_AFTER('02-FEB-99', 365)

This function returns '02-FEB-00'. Sybase uses DAYS:

SELECT DAYS('02-FEB-99', 365)

Check your RDBMS documentation for specific date functions.
GROUP BY
The SQL GROUP BY clause, which allows you to group data, is particularly useful when taking breaks on aggregate functions (such as showing subtotals). When a SELECT has a column with an aggregate function, you must GROUP BY all columns to which no aggregate function is applied. The following query performs a sum on EMP_SALARY and also selects EMP_DEPT_NO and EMP_GENDER. Therefore, you must group by the latter two columns, as shown in this example:

SELECT EMP_DEPT_NO, EMP_GENDER, SUM(EMP_SALARY)
FROM EMPLOYEE
GROUP BY EMP_DEPT_NO, EMP_GENDER
ORDER BY EMP_DEPT_NO, EMP_GENDER

It sometimes confuses developers that they need to GROUP BY all non-aggregate columns, but the example shows it only makes sense: if you are summing salary information, you need to take breaks (show subtotals) on all the other columns. The first rows of the result follow, with one row for each department and gender combination:

EMP_DEPT_NO  EMP_GENDER  SUM(EMP_SALARY)
-----------  ----------  ---------------
100          F           328802.50
100          M           147374.18
...
You can use multiple aggregate functions, as shown in the following example, which compares minimum, maximum, and average salaries for males and females in each department and shows a count for each:

SELECT EMP_DEPT_NO AS 'DEPT', EMP_GENDER AS 'SEX', COUNT(*) AS 'COUNT',
       MIN(EMP_SALARY) AS 'MIN SALARY', MAX(EMP_SALARY) AS 'MAX SALARY',
       ROUND(AVG(EMP_SALARY),2) AS 'AVG SALARY'
FROM EMPLOYEE
GROUP BY EMP_DEPT_NO, EMP_GENDER
ORDER BY EMP_DEPT_NO, EMP_GENDER

The result of the query follows:

DEPT  SEX  COUNT  MIN SALARY  MAX SALARY  AVG SALARY
----  ---  -----  ----------  ----------  ----------
100   F    5      34591.90    127500.00   65760.50
100   M    3      21313.97    72563.34    49124.73
200   F    3      27487.44    72789.20    49038.55
200   M    6      33545.31    77492.47    51904.22
300   F    8      18331.50    104800.00   65632.07
300   M    1      77568.70    77568.70    77568.70
400   F    2      33543.13    42216.41    37879.77
400   M    4      36201.78    73643.82    62034.65
As shown in the result, females in this company tend to make less than their male counterparts in similar jobs except in Department 100. (That was entirely unintentional. All data in the supplied sample tables was generated randomly.) You can also GROUP BY columns that do not appear in the SELECT list. This option mostly has applications in subselects, which I discuss later in this chapter. Sometimes, however, this option is useful even without subselects but in conjunction with the HAVING clause, which I discuss next.
HAVING
The last SELECT clause to cover is the HAVING clause. The HAVING clause allows you to restrict groups of data based on the results of aggregate functions and thus is usually associated with the GROUP BY clause. You will most likely read in some SQL texts that HAVING must be used in conjunction with the GROUP BY clause, but that is not actually true. The following statement, although not very meaningful, executes just fine:
SELECT SUM(EMP_SALARY) FROM EMPLOYEE
HAVING SUM(EMP_SALARY) > 1

The example says to select the sum of all salaries if that sum is greater than one. When used in conjunction with GROUP BY, the HAVING clause is more meaningful. I refine the earlier example of salaries by department and gender to show only those department and gender combinations where the average salary is less than $50,000:

SELECT EMP_DEPT_NO AS 'DEPT', EMP_GENDER AS 'SEX', COUNT(*) AS 'COUNT',
       MIN(EMP_SALARY) AS 'MIN SALARY', MAX(EMP_SALARY) AS 'MAX SALARY',
       ROUND(AVG(EMP_SALARY),2) AS 'AVG SALARY'
FROM EMPLOYEE
GROUP BY EMP_DEPT_NO, EMP_GENDER
HAVING AVG(EMP_SALARY) < 50000
ORDER BY EMP_DEPT_NO, EMP_GENDER

The new report follows:

DEPT  SEX  COUNT  MIN SALARY  MAX SALARY  AVG SALARY
----  ---  -----  ----------  ----------  ----------
100   M    3      21313.97    72563.34    49124.73
200   F    3      27487.44    72789.20    49038.55
400   F    2      33543.13    42216.41    37879.77
Sometimes it is useful to combine the GROUP BY and HAVING clauses even where the aggregate function does not appear in the SELECT list:

SELECT EMP_DEPT_NO FROM EMPLOYEE
GROUP BY EMP_DEPT_NO
HAVING MIN(EMP_SALARY) < 25000

This query returns two rows (Departments 100 and 300) where the minimum salary is less than $25,000. You can also use HAVING in conjunction with a GROUP BY column that does not appear in the SELECT list, as you will see in the next section.
Subqueries
A subquery can be considered a way to join the result of two queries. Suppose you want to find all employees who work in a department whose average salary is greater than $55,000. You really need to first know which departments have an average salary greater than $55,000:

SELECT EMP_DEPT_NO, AVG(EMP_SALARY)
FROM EMPLOYEE
GROUP BY EMP_DEPT_NO

As shown in the following result, you need to list those employees in Departments 100 and 300:

EMP_DEPT_NO  AVG(EMP_SALARY)
-----------  ---------------
100          59522.08
200          50948.99
300          66958.36
400          53983.02
You can now run a second query:

SELECT EMP_NO, EMP_DEPT_NO, EMP_FNAME, EMP_LNAME, EMP_SALARY
FROM EMPLOYEE
WHERE EMP_DEPT_NO IN (100, 300)

The trick, however, is to ask both questions at once. To do this, you need a subquery:

SELECT EMP_NO, EMP_DEPT_NO, EMP_FNAME, EMP_LNAME, EMP_SALARY
FROM EMPLOYEE
WHERE EMP_DEPT_NO IN
    (SELECT EMP_DEPT_NO
     FROM EMPLOYEE
     GROUP BY EMP_DEPT_NO
     HAVING AVG(EMP_SALARY) > 55000)

The subquery returns a list of those departments where the department's average salary is greater than $55,000, and the full statement produces the following result set:

EMP_NO  EMP_DEPT_NO  EMP_FNAME  EMP_LNAME    EMP_SALARY
------  -----------  ---------  -----------  ----------
101     100          ANN        CALLAHAN     127500.00
105     100          EUNICE     BROWN        61426.08
113     100          MICHAEL    ANDERSON     72563.34
117     100          ROSE       DANIELS      67572.42
119     300          URSULA     SMITHSONIAN  104800.00
121     100          WINIFRED   VANCE        37712.10
123     300          ALAN       LAWRENCE     77568.70
125     100          CHARLES    GERRFON      53496.87
127     300          EUGENE     FITZ         58120.56
129     100          GEORGE     DE NORVA     21313.97
131     300          JOAN       WILLIAMS     74586.61
133     100          MAUREEN    PODANSKI     34591.90
135     300          PATRICIA   NORTON       61597.15
103     300          CAROL      SMITH        74748.78
107     300          GENEVIEVE  WESLEY       18331.50
111     300          KATHRYN    AMES         97600.00
115     300          OLIVERA    MAHARAMBA    35271.95
If I want to know which employee in each department has the highest salary, I need to use what is called a correlated subquery. A correlated subquery is one where a search condition in the inner query's WHERE clause is correlated to the current row of the outer query.
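The query itself did not survive in this excerpt; the following sketch is reconstructed to match the output shown next, using the alias EMP as the correlation name:

SELECT EMP_DEPT_NO, EMP_FNAME, EMP_LNAME, EMP_SALARY
FROM EMPLOYEE EMP
WHERE EMP_SALARY =
    (SELECT MAX(EMP_SALARY)
     FROM EMPLOYEE
     WHERE EMP.EMP_DEPT_NO = EMP_DEPT_NO)
ORDER BY EMP_DEPT_NO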
The output of the query looks like the following:

EMP_DEPT_NO  EMP_FNAME  EMP_LNAME    EMP_SALARY
-----------  ---------  -----------  ----------
100          ANN        CALLAHAN     127500.00
200          DAVID      MADISON      77492.47
300          URSULA     SMITHSONIAN  104800.00
400          PAUL       JAMESSON     73643.82
The alias EMP in the outer query acts as a correlation variable, allowing the inner query's WHERE clause to match EMP_DEPT_NO from the outer query's current row against the same column in the inner query.
You use the EXISTS keyword in subqueries to test for the existence of occurrences in an outer query. Suppose you want to find all departments that have one or more employees making less than $30,000. You can run a query and test whether the results of the inner query exist in the outer query. The following example selects EMP_DEPT_NO from EMPLOYEE and then restricts the results to those rows where the department number is also returned by the inner query:

SELECT EMP_DEPT_NO FROM EMPLOYEE EMP
WHERE EXISTS
    (SELECT EMP_DEPT_NO FROM EMPLOYEE
     WHERE EMP.EMP_NO = EMP_NO
     AND EMP_SALARY < 30000)
ORDER BY EMP_DEPT_NO

The result is:

EMP_DEPT_NO
-----------
100
200
300

You have a variety of other ways to join the results of an inner query to an outer query, and you can nest the queries as deeply as necessary. Two keywords that you will find in most databases are ANY and ALL. The ANY keyword allows you to compare a column from the outer query to any of the values returned by the inner query. Assuming you want to know all the male employees who have a female manager, you can code the query as follows:
SELECT EMP_NO, EMP_FNAME, EMP_LNAME
FROM EMPLOYEE
WHERE EMP_GENDER = 'M'
AND EMP_MGR_ID = ANY
    (SELECT EMP_NO FROM EMPLOYEE WHERE EMP_GENDER = 'F')
ORDER BY EMP_NO

The results are:

EMP_NO  EMP_FNAME  EMP_LNAME
------  ---------  ---------
112     LARRY      JOHANSSEN
113     MICHAEL    ANDERSON
116     PAUL       JAMESSON
120     VINCENT    LINCOLN
123     ALAN       LAWRENCE
125     CHARLES    GERRFON
129     GEORGE     DE NORVA
132     KEVIN      KERRIGAN
TIP A Few Words About COMMIT And ROLLBACK Chapter 11 delves into the concepts of transactions, sometimes called logical units of work (LUW). Every transaction ends with either a COMMIT or a ROLLBACK. COMMIT makes permanent all changes to the database in the current transaction. ROLLBACK puts the database back to where it was. In other words, it undoes all the changes made during the current transaction. In the following sections, I show you examples of commands that alter data. So that your results remain the same as mine, I highly urge you to create a backup of the database so that you can restore it to its original condition. Alternatively, you can issue a ROLLBACK command after each INSERT, DELETE, or UPDATE to undo all of the changes.
INSERT
It comes as no surprise that the INSERT statement is used to add new records to a table. The syntax of the command is:

INSERT INTO tablename [(column-list)] VALUES (value1, value2, ... valuen)

The tablename is an existing table on which the developer has insert authority. The column-list names those columns into which data is to be inserted. If values are supplied for all columns in table order, the column list may be omitted. The following statement adds a new row to the DEPARTMENT table. Because all columns are used, the column list is omitted:

INSERT INTO DEPARTMENT
VALUES (500, 'NE', 'New Department')

Notice that all character values must be enclosed in single quotes. If you use double quotes, SQL assumes you are referencing another column. If you are inserting a row where a column is to be set to NULL, you must specify the NULL keyword:

INSERT INTO DEPARTMENT VALUES (500, 'NE', NULL)
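Alternatively, when you supply only some of the columns, name them explicitly; columns you omit receive their defaults or NULL. A sketch using the DEPARTMENT columns referenced earlier in this chapter:

INSERT INTO DEPARTMENT (DEPT_NO, DEPT_NAME)
VALUES (501, 'Another New Department')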
DELETE
The DELETE statement removes one or more rows from a table. The syntax is:

DELETE FROM tablename [WHERE condition]

The tablename is any valid table. If you omit the WHERE clause, all rows are deleted. The following statement deletes all employees whose department is 100:

DELETE FROM EMPLOYEE WHERE EMP_DEPT_NO = 100
UPDATE
The UPDATE command updates one or more rows in a single table and takes the following syntax:

UPDATE tablename
SET column_name = value [, column_name = value]
[WHERE condition]

You can update any one or more columns, although most RDBMSs won't allow you to update the primary key columns. If you leave out the WHERE clause, all rows are updated. In the following code snippets, the first example gives all employees an 8 percent raise. The second command then gives all female employees in Departments 200, 300, and 400 an additional 10 percent raise (obviously to make up for the inequities noted in the GROUP BY discussion earlier in this chapter):

UPDATE EMPLOYEE SET EMP_SALARY = EMP_SALARY * 1.08

UPDATE EMPLOYEE SET EMP_SALARY = EMP_SALARY * 1.1
WHERE EMP_GENDER = 'F' AND EMP_DEPT_NO IN (200, 300, 400)
Where To Go From Here

This chapter has really only scratched the surface of what can be done with the SELECT statement. A point of confusion for any developer is where to perform processing. For instance, you can do a simple SELECT from the database and then sort the result set at the client using Visual Basic. Alternatively, you can use the ORDER BY clause to do the sorting at the database level. Although there is no hard and fast rule, common sense usually dictates the correct answer. In general, it is better to let the database perform the work unless doing so would generate more network traffic. The most obvious example is formatting numbers (adding thousands and decimal separators and currency symbols); that is probably best left to Visual Basic to add when it displays the data. I discuss these issues in future chapters.

If you feel comfortable with SQL, particularly with the SELECT statement, then read on. If not, you might want to consult one of the many books on the subject. If you will be developing for more than one RDBMS, you might want to consider books that are not RDBMS-specific. Otherwise, you probably want to find a book that is specific to your database. If your development will involve Microsoft Jet, you might want to consult Appendix B, where I discuss some of the vagaries of that dialect of SQL and how it differs from ANSI SQL. If you are using the ODBC API, you will also want to refer to Appendix C. If you haven't yet read Chapter 2, you might want to do so; I discuss the other two portions of SQL (DCL and DDL) there. For advanced topics, sneak a peek at Chapter 11. Otherwise, read the next chapter (Chapter 4), where I overview data access with Visual Basic. Finally, don't forget that the Internet provides a wealth of information. Specifically, each of the major database vendors has its own Web site, and a search of the knowledge base at Microsoft's Web site using the terms database and SQL yields many tricks and tips.
-----------
the ODBC API as a means of database development. During VB5's development, Microsoft discussed the new Active Data Object model. However, the technology was not ready by the time VB5 was shipping (and, in fact, it will be evolving for some time to come), and most VB developers therefore did not realize that the ADO model was another development option.

When the purchaser of Visual Basic 6 opens the box, he or she will again see something that looks basically familiar. However, the client/server developer will quickly lose that illusion when poking around under the covers. Wizards that once generated DAO- or RDO-based forms don't even offer those data models as an option. The days of ADO have arrived, and ADO 1.5 is a welcome, robust, and scalable data model upon which to develop. The ADO Data control, although not backward compatible with the RDC or the DAO Data control, is similar enough that most VB developers will have little problem adjusting. Best of all, ADO combines the best of DAO (access to ISAM databases) and RDO (efficient access to ODBC data sources) with a high-performance, low-overhead OLE DB engine.

Although this book discusses all of the data models, the unmistakable trend is toward Active Data Objects. In the pages that follow, I discuss each of these data models with a brief overview and some examples. In the chapters that follow this one, each model is examined in more depth. For some developers, it will be worth the effort to convert existing projects to ADO, but for others, there will not be enough payback at this point in time to do so.
-----------
Table 4.1 (fragment; the column headings and the first two rows did not survive in this excerpt):

Method                        Ratings
----------------------------  --  --  --  --
DAO/Jet*                       4   7   5   6
DAO/ODBCDirect Data controls   5   8   8   9
DAO/ODBCDirect*                6   7   5   5
RDO Data controls              6   9   8   9
RDO*                           7   8   5   6
VBSQL                          9   7   5   4
ODBC API                      10   4   1   1
ADO Data controls              9  10   8   9
ADO*                          10   9   5   7

*Writing
Table 4.1 lists 11 separate methods of data access that all overlap at some point or other. Table 4.2 shows which types of databases each method can access. Flat file I/O (using VB's Open and Get keywords) is, of course, the roll-your-own approach and offers little to no realistic likelihood of accessing a true relational database. Under some circumstances, flat file I/O may be appropriate for ISAM files. DAO/Jet (with or without the Data control) can process some sequential file formats, but only to the extent that they are structured. DAO/Jet can also access most major relational databases. DAO/ODBCDirect (with or without the Data control) can access almost any relational database for which an ODBC driver is supplied and can access many ISAM formats as long as an ODBC driver has been defined. RDO is strictly bound to ODBC data sources, so it can connect to virtually all relational databases and many ISAM files. VBSQL is used with Microsoft SQL Server using DB-Lib. Older versions of Sybase SQL Server also support the DB-Lib interface and should, in theory at least, be accessible via VBSQL. The ODBC API method, like RDO, can access any ODBC data source. ADO is meant to cover the entire spectrum of data sources, including those that may be defined at some point in the future.

Table 4.2 The types of databases and which data access methods can access each.

Method                        Sequential  ISAM     Relational
Flat File I/O                 Yes         Limited  No
DAO/Jet Data controls         Limited     Yes      Yes
DAO/Jet*                      Limited     Yes      Yes
DAO/ODBCDirect Data controls  No          Limited  Yes
DAO/ODBCDirect*               No          Limited  Yes
RDO Data controls             No          Limited  Yes
RDO*                          No          Limited  Yes
VBSQL                         No          No       DB-Lib only
ODBC API                      No          Yes      Yes
-----------
The five open_mode options are Append, Output, Input, Random, and Binary. The first three options are used where a file is to be accessed sequentially (as opposed to randomly). In Append mode, any data that you write is added to the end of the file. Conversely, in Output mode, any data overwrites the file from the beginning. I discuss Random and Binary modes in a moment. The default is Random.

The access_mode determines whether the file can be read from, written to, or both. The options are Read, Write, and ReadWrite. The default is ReadWrite.

The lock_mode option specifies how file locking is handled in a multiuser environment. The options are Shared, Lock Read, Lock Write, and Lock Read Write. If you open a file in Shared mode, any other process has full access to the file. In Lock Read mode, the file is locked against others reading it. Likewise, in Lock Write mode, other processes may not write to the file. The most restrictive (and default) mode is Lock Read Write, which prevents any other process from opening the file while your program has it open. If you have opened a file as Lock Write and another program attempts to open it in Output mode (for example), a runtime error is generated in that program.

You must assign an unused file_number when opening a file. All subsequent operations against the file reference the assigned file number. The number must be in the range of 1 to 511. Use numbers of 255 and below if the file will not be used by other processes. Use numbers above 255 if the file can be opened by other processes. Use the FreeFile function to return the next available file number:

' Return a file number from 1 to 255
Dim iFile1 As Integer
iFile1 = FreeFile

' Return a file number from 256 to 511
Dim iFile2 As Integer
iFile2 = FreeFile(1)

The pound sign (#) before the file number is optional; it is retained by Visual Basic for backward compatibility with Basic versions that did require it.

The record_length argument is used with files opened in Random mode and is ignored if the file is opened in Binary mode. With files opened randomly, all records in the file are normally a fixed length. This provides the ability to locate a record quickly by simply supplying a record number. For example, if all records in a file are 100 bytes long, the operating system can locate record number 4 by moving the file pointer to byte 301 in the file (assuming the first record begins at byte 1, the second record at byte 101, and so on). Of course, your program is responsible for maintaining the logic necessary to know where in the file a given record is. When a file is opened in Binary mode, you can move around the file by specifying an absolute byte offset from the beginning of the file or an offset from the current location in the file.

Once you have finished using a file, use the Close statement followed by the file number to close it. If you do not supply a file number, all currently open files are closed. When a file is closed, the file number is disassociated from that file and can be used again for another file. You should be sure to close all open files, especially those opened in Output or Append modes, because closing forces the changes to be written to disk. Failure to properly close the file can result in your changes being lost.
Listing 4.1 opens three files (in Append, Input, and Random modes). Notice that a user-defined type, EmpRec, is created and used as an argument to the Len clause on the Random file; the file will be used to store employee records.

Listing 4.1 Demonstration of several flat file access techniques.

' Create a user-defined data type for the random file
' (user-defined types must be declared at module level)
Type EmpRec
    Emp_No As Integer
    Emp_Name As String * 30
    Salary As Currency
End Type

' Variables to store file numbers
Dim iFile1 As Integer
Dim iFile2 As Integer
Dim iFile3 As Integer

' Open a file in Append mode
iFile1 = FreeFile
Open "c:\program.log" For Append Lock Read Write As iFile1

' Open a file in Input mode, read only
iFile2 = FreeFile
Open "c:\autoexec.log" For Input Access Read As iFile2

' Create a variable of type EmpRec
Dim strEmp As EmpRec

' Open a file randomly using a record length
iFile3 = FreeFile
Open "c:\employee.dat" For Random As iFile3 Len = Len(strEmp)

' Read record number 8 and display it
Get #iFile3, 8, strEmp
MsgBox strEmp.Emp_Name & "'s salary is " & _
    Format$(strEmp.Salary, "$###,##0.00")

' Close the first file
Close iFile1

' Close all other open files
Close
Reading Files
Use the Input # statement to read a file opened in Binary or Input mode, using the syntax Input # file_number, var1, var2. You use this statement to read a comma-separated list of variables from a file:

' Read three variables from a file
Dim myVar1, myVar2, myVar3
Open "c:\myfile.dat" For Input As 1
Input #1, myVar1, myVar2, myVar3
Close 1
The Input function (without the pound sign), also used with files opened in Binary or Input mode, differs from the Input # statement in that it reads a fixed number of characters from the file, including commas, carriage returns, and line feeds, as shown in the following example:

Dim sBuffer As String
Open "c:\autoexec.bat" For Input As 1
' Read 100 characters
sBuffer = Input(100, 1)

Line Input is used to read an entire line of text, up to (but not including) the carriage-return and line-feed characters. The following code reads a file and displays each line in a ListBox control. The result is shown in Figure 4.1:

Dim sBuffer As String
' Open the file
Open "c:\autoexec.bat" For Input As 1
Do While Not EOF(1)
    Line Input #1, sBuffer
    ' List1 is an existing ListBox control
    List1.AddItem sBuffer
Loop
Close 1
Figure 4.1 Example of using Line Input to populate a ListBox control with the contents of a file.

Notice that Line Input requires the use of the pound sign in front of the file number. Consistency is not Microsoft's middle name.

You use the Get statement to read a file opened in Random or Binary mode. With Random mode, you supply a record number, whereas with Binary mode, you supply a byte offset. If the record number or byte offset is omitted, reading begins at the next record or byte in the file. The following statement reads a file that was previously opened randomly:

' Read the 55th record into the strEmpRec variable
Get #iFile3, 55, strEmpRec

The following statements read the next 512 bytes into a variable from a file previously opened as Binary (the file number here is assumed for illustration):

' sBuffer must be pre-sized to the number of bytes to read
sBuffer = String$(512, " ")
Get #iFile1, , sBuffer
Navigating Files
The current position in the file is said to be the file pointer. Therefore, if the current record is number 55 in the file, the file pointer is 55. The Seek statement moves the file pointer (that is, it sets the location in the file for the next read or write), whereas the Seek function returns the current file pointer:

' Move to the 30th record
Seek #iFile3, 30
' Read the next record
Get #iFile3, , strEmpRec
' Display the current record number (31)
MsgBox Seek(iFile3)

The Loc function is the counterpart to Seek, returning the location of the last read or write. The counterpart to Get is Put, which writes a record (in Random mode) or the value of a variable (in Binary mode):

Put #iFile3, , strEmpRec
Writing To Files
Put has additional functionality that varies somewhat depending on whether the file was opened for Binary or Random. For instance, you can write the contents of an array to disk. See the Visual Basic help file for additional information. The Write # statement writes data to a file in the same format that it is read with the Input # function. Write # places quotes around strings and commas after each variable. A carriage-return and line-feed sequence is written after each Write # operation. Dates are surrounded by pound signs (#December 25, 1998#). Boolean data is written as #TRUE# or #FALSE#. Print # writes formatted data to a sequential file. Strings do not have quotes and variables are not separated by commas. However, a carriage-return and line-feed sequence is written after each Print # operation unless the last character is a semicolon or comma. Print # is useful for writing formatted reports to a disk file for printing later. See the Visual Basic help file for more details on some of the ways that you can use Print #.
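To make the contrast concrete, here is a minimal sketch (the file name is hypothetical) showing what each statement writes:

' Contrast Write # and Print # output formats
Open "c:\demo.txt" For Output As #1
Write #1, "Smith", 42    ' writes: "Smith",42
Print #1, "Smith", 42    ' writes: Smith         42   (print zones)
Close #1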
-----------
' Opening lines reconstructed: read a Binary file 512 bytes at a time
Dim sBuffer As String: sBuffer = String$(512, " ")
Dim lRemaining As Long
Open "c:\myfile.dat" For Binary As 1
lRemaining = LOF(1)
Do
    ' Read the data
    Get #1, , sBuffer
    lRemaining = lRemaining - Len(sBuffer)
    If lRemaining < 1 Then Exit Do
Loop

The Visual Basic help file contains information on other functions that may be useful, such as FileDateTime (which returns the date and time when a file was last modified), GetAttr (which returns the attributes of a file, such as Hidden or System), and SetAttr (which sets file attributes).
Capabilities
With the exception of ADO, DAO is the most flexible of the data access methods available to the VB developer. It offers the ability to manipulate ISAM databases as well as relational databases, shielding the developer from many complexities and creating a layer where the developer can use a common code base to interact with multiple back ends. In other words, if the developer uses a reasonable amount of planning, he or she can use the same code to interact with both FoxPro and Oracle. The actual implementation of DAO is shown in Figure 4.2.
Figure 4.2 The relationship of DAO to the application and to the database.

The original implementation of DAO was with Microsoft Jet, which in turn interacts with the ODBC Driver Manager. The ODBC Driver Manager handles all low-level interactions with the database itself. Unfortunately, what Jet gains in flexibility (because it can interface to virtually any data source), it loses in performance. Jet is robust in its capabilities but is a thick layer between your applications and the data, slowing down all database access. Also, Jet leaves a large footprint in memory; it occupies more than a megabyte of RAM. The seasoned (or should I say grizzled?) Visual Basic developer who wrote database applications with Visual Basic 3 used Jet version 1.1. This was a primitive implementation: Jet 1.1 pulled all of the records from a table in a back-end database such as Oracle and then performed the query locally. Jet's subsequent performance enhancements derive in large part from allowing the back-end database to do what it does best: process queries.
ODBCDirect was added to DAO with version 5 of Visual Basic. It is not as flexible as Jet in that it can only communicate with defined ODBC data sources. However, it bypasses Jet and communicates with RDO. It also offers some functionality improvements because it can access ODBC features not available when going through Jet. Because ODBCDirect is a thin layer leaving a minimal footprint in memory, it offers performance nearly as good as RDO itself. ODBCDirect essentially maps DAO functionality to RDO functionality. The RDO layer interacts with the ODBC Manager. In Chapter 5, I discuss the DAO object hierarchy in depth.
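In code, selecting ODBCDirect rather than Jet is a matter of how the Workspace is created. A minimal sketch, assuming an ODBC data source named Sample exists (the DSN and credentials are hypothetical):

' Create an ODBCDirect workspace instead of a Jet workspace
Dim wrk As Workspace
Dim cn As Connection

Set wrk = DBEngine.CreateWorkspace("ODBCWork", "admin", "", dbUseODBC)
Set cn = wrk.OpenConnection("", dbDriverNoPrompt, False, _
    "ODBC;DSN=Sample;UID=admin;PWD=")

The dbUseODBC constant is what routes the workspace through ODBCDirect; with dbUseJet (the default), the same DAO code would go through the Jet engine instead.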
Using DAO
DAO has a fairly complex object hierarchy. (See Figures 5.1 and 5.2 in Chapter 5 for a graphical representation of this hierarchy). In this model, all data access objects exist in the context of a single DBEngine object.
later in this chapter. For new relational database development efforts, I see no compelling reason to use DAO over ADO, because OLE DB seems to be a much more robust technology than Jet and especially because Microsoft is clearly going to concentrate its development bucks there. For the typical desktop development effort, I see no reason not to use Jet against ISAM-type databases, including MS Access. If more robustness is needed (such as logging), then I recommend a product such as Personal Oracle or Sybase SQL Anywhere, both of which offer good performance with modest resource requirements. Otherwise, Jet is a proven data engine technology for file-oriented databases.

The DBEngine object contains the Workspaces collection, which consists of individual Workspace objects. Each Workspace object describes a current session with the database and includes the Databases collection. Each Database object includes several collections, including QueryDefs and Recordsets. A QueryDef object describes an SQL SELECT. Each Recordset object is an active result set from the database. The actual implementation of the object hierarchy varies a little based on whether ODBCDirect is used. Because the objects themselves vary, there is also some variation in the methods and events available to the VB developer. For instance, ODBCDirect offers the Cancel method to cancel a pending asynchronous query.

DAO gives the developer the opportunity to use data-aware controls, greatly simplifying the task of DAO-based development. This opportunity comes at the expense of some flexibility in design but can turn 200 lines of code into 20 or fewer. For instance, the Data control automatically connects to the database, builds a Recordset object, handles the chore of scrolling through the Recordset, and performs all updates behind the scenes. Without the Data control, the VB developer has to code all of this functionality. Likewise, by having a Data control, the user can bind other data-aware controls, such as the TextBox and the ListBox, to individual columns in the Data control's Recordset object. Without the Data control, the developer has to manually populate each TextBox (or other control) as the user scrolls through a Recordset. Figure 4.3 shows a form module that uses the Data control and several TextBox controls to display records from the Employee table in the Access version of the sample database provided with this book.
Figure 4.3 The Employee application created using the Data control and very little coding.
TIP Using The Sample Code Because the application in Figure 4.3 uses an Access database without ODBC, the path to the database file is hard-coded in the DatabaseName property of the Data control. In addition, because the database is on your CD-ROM, the database cannot be updated. To run this code on your own, copy the project file and the form file to your hard drive along with the Access database. Open the project and change the DatabaseName property of the Data control to reflect the new path.
I actually wrote only one line of code (shown below) to display the record number. The rest of the code was all generated as a result of drawing controls on the form and setting their properties as needed.

Private Sub Data1_Reposition()
    Data1.Caption = "Record " & _
        Str(Data1.Recordset.AbsolutePosition + 1) & _
        " of " & Str(Data1.Recordset.RecordCount)
End Sub

In Chapter 5, I expand on the use of DAO, adding functionality such as error handling, data validation, and so on, using the sample data provided with this book (see Appendix A). I will also walk through the process of coding a DAO-based application without using the Data control, both as a means of illustrating the DAO object hierarchy and to provide additional flexibility to the application. A first taste of that approach appears in the sketch below.
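A minimal sketch of walking the hierarchy without a Data control; the .mdb path is hypothetical, but the objects shown (DBEngine, Workspace, Database, Recordset) are the DAO objects just described:

' Open a Jet database and loop through a result set manually
Dim wrk As Workspace
Dim db As Database
Dim rs As Recordset

Set wrk = DBEngine.Workspaces(0)    ' the default session
Set db = wrk.OpenDatabase("C:\data\sample.mdb")
Set rs = db.OpenRecordset("SELECT EMP_FNAME, EMP_LNAME FROM EMPLOYEE")

Do While Not rs.EOF
    Debug.Print rs!EMP_FNAME & " " & rs!EMP_LNAME
    rs.MoveNext
Loop

rs.Close
db.Close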
RDO Capabilities
RDO puts a much thinner layer between your application and the data than DAO does, as shown in Figure 4.4. RDO basically places a thin wrapper around the ODBC API, allowing the developer most of the benefits of the ODBC API with few drawbacks. The chief advantages of RDO are:

No need for a local query processor (Jet); query processing is done remotely.
A smaller memory footprint.
Event-driven asynchronous queries.
Figure 4.4 The Remote Data Objects model.

The Visual Basic 5.0 implementation of RDO (RDO 2.0) and the Remote Data control (RDC) was a significant improvement over the original Visual Basic 4.0 implementation (RDO 1.0). For example, under Visual Basic 4.0, you had to continuously poll for the completion of a database operation. RDO 6.0, introduced with Visual Basic 6 (don't ask me what happened to versions 3, 4, and 5), is another improvement. Unlike DAO, in which data access is usually performed synchronously (meaning that the program has to wait for the result set to be returned before processing can continue), RDO permits asynchronous queries of the database. For example, if you set the RDC's Options property to rdAsyncEnable, the ResultSet is populated as a background task. When the query is complete (and the ResultSet fully populated), the QueryCompleted event is fired, giving your application an event-driven means of knowing that the query has finished. The RDC's functionality is similar to that of the Data control, which most VB developers have used. Likewise, the Remote Data Object hierarchy is similar to (though simpler than) the Data Access Object hierarchy.
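To make the asynchronous model concrete, here is a hedged sketch using the rdoConnection object rather than the RDC. It assumes a project reference to the Microsoft Remote Data Object library and an ODBC DSN named Coriolis VB Example with hypothetical credentials, and it uses the rdoConnection's QueryComplete event rather than the RDC event described above.

' Module-level declarations so the connection's events can fire
Private WithEvents mCn As rdoConnection
Private mRs As rdoResultset

Private Sub Form_Load()
    Set mCn = New rdoConnection
    ' Hypothetical DSN and credentials
    mCn.Connect = "DSN=Coriolis VB Example;UID=coriolis;PWD=coriolis"
    mCn.EstablishConnection rdDriverNoPrompt
    ' rdAsyncEnable returns control immediately; the result set
    ' is populated as a background task
    Set mRs = mCn.OpenResultset("SELECT * FROM employee", _
        rdOpenKeyset, rdConcurReadOnly, rdAsyncEnable)
End Sub

Private Sub mCn_QueryComplete(ByVal Query As rdoQuery, _
    ByVal ErrorOccurred As Boolean)
    If Not ErrorOccurred Then
        MsgBox "Query done; " & mRs.RowCount & " rows."
    End If
End Sub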
Using RDO
Figure 4.5 shows an application created using the RDC that is almost identical to the DAO example in Figure 4.3. The snippet below places the current row number into the Caption property of the RDC. The properties of the RDC are almost identical to those of the Data control; the few differences mostly reflect the SQL row orientation of the Remote Data control versus the file record orientation of the Data control. For instance, DAO's Recordset property is equivalent to RDO's Resultset property, and DAO's RecordCount property is equivalent to RDO's RowCount property, as shown in the following code snippet:

Private Sub MSRDC1_Reposition()
    MSRDC1.Caption = "Record " & _
        Str(MSRDC1.Resultset.AbsolutePosition) & _
        " of " & Str(MSRDC1.Resultset.RowCount)
End Sub
Figure 4.5 The Employee application coded using the Remote Data control.
VBSQL
VBSQL is a library for accessing Microsoft SQL Server via DB-Lib. Because Microsoft is moving away from DB-Lib and toward ODBC- and OLE DB-oriented access to SQL Server, VBSQL is not recommended except for maintaining existing projects that already use it.
VBSQL Capabilities
VBSQL is written to provide an interface to Microsoft SQL Server only. Specifically, any functionality exposed by SQL Server's DB-Lib is implemented by VBSQL. Connections tend to be cursor-oriented, and you are pretty much on your own to manually populate controls with information retrieved from the database (as opposed to binding controls to a data source). On the other hand, you have an unusual degree of control (relative to other Visual Basic data models) over the efficiency of your database processing. For instance, you can set the packet size for communication between the database server and your application by using the SQLSetPacket function before calling the SQLOpen function. (Note that you can determine the current packet size using SQLGetPacket, but once a connection is established, the packet size cannot be changed.) The packet size can be any size up to 64K and, if not specified, is determined by the current SQL Server default. If the default is set to 4,096 bytes but your application is mostly updating and retrieving single records, the large packet size slows down your application. On the other hand, if your application processes a lot of data at a time, a larger packet size is preferable because it reduces the number of network round trips.
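As a hedged illustration, setting the packet size might look like the following. Only the function names come from the discussion above; the argument lists are assumptions modeled on the DB-Lib calls these wrappers mirror, and myLogIn is the login record obtained from SQLLogIn (shown later in this chapter).

Dim lRtn As Long
Dim lSQLConn As Long
Dim lCurrent As Long

' Sketch only: set a small packet size for chatty, single-row traffic.
' SQLSetPacket works against the login record, before the connection opens
lRtn = SQLSetPacket(myLogIn, 512)
lSQLConn = SQLOpen(myLogIn, "Home")
' Report the size in effect; it cannot be changed once connected
lCurrent = SQLGetPacket(lSQLConn)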
Using VBSQL
To use VBSQL, you need to visit the Microsoft Web site and download the VBSQL ActiveX control (look for VBSQL.OCX). The version I used in this book was a prerelease version, but a shipping version should be available by the time you read this. VBSQL requires the use of the VBSQL control, which is really more like a DLL than a Visual Basic control. You need to add to your project the VBSQL.BAS module, which is also available from the Microsoft Web site and included on the CD-ROM (be sure to check the Web site for a more recent release). The VBSQL.BAS module declares a number of functions, such as those for establishing a connection with the database. The declarations all reference the OCX file, as shown in this code snippet:

Declare Function SqlNextRow Lib "VBSQL.OCX" _
    (ByVal SqlConn As Long) As Long

Figure 4.6 shows a VBSQL application running in front of the VB6 development environment. A quick look at the two open code windows reveals that this coding, although not terrifically difficult, is tedious. Because Microsoft is in the process of eliminating the DB-Lib interface, and because the Active Data Object model is available, you have little reason to use VBSQL code. Because of the length of the code, and the fact that I don't recommend VBSQL when clearly better data access models are available, the entire 20 or so pages of code are not printed in the book.
Figure 4.6 The Employee application using VBSQL.

To open a connection to the database, a sample code snippet might look like the following:

Dim lSQLConn As Long
Dim lRtn As Long
Dim sRtn As String
' Share with other procedures
Private myLogIn As Long

' Initialize
sRtn = SQLInit()
' Establish login record
myLogIn = SQLLogIn()
' Set login parameters
lRtn = SQLSetLUser(myLogIn, "Coriolis")
lRtn = SQLSetLPwd(myLogIn, "Coriolis")
' Workstation
lRtn = SQLSetLHost(myLogIn, "W243A")
' Can add other parameters
lRtn = SQLSetLApp(myLogIn, "Coriolis VBSQL Sample App")
' Establish connection
lSQLConn = SQLOpen(myLogIn, "Home")
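Once connected, retrieving rows follows the usual DB-Lib pattern. The following is a sketch only: SqlNextRow appears in the declarations shown earlier, while the other function names and the NOMOREROWS constant are assumptions based on the DB-Lib calls that VBSQL.BAS wraps.

' Send a query and walk the result rows, DB-Lib style (sketch)
lRtn = SqlCmd(lSQLConn, "SELECT emp_no, emp_lname FROM employee")
lRtn = SqlExec(lSQLConn)
lRtn = SqlResults(lSQLConn)
Do While SqlNextRow(lSQLConn) <> NOMOREROWS
    ' SqlData returns column values as strings
    Debug.Print SqlData(lSQLConn, 1); " "; SqlData(lSQLConn, 2)
Loop
SqlClose lSQLConn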
Listing 4.3 Code from frmDSNList.

Private Sub cmdClose_Click()
    End
End Sub

Private Sub cmdGetSources_Click()
    Dim iRtn As Integer, iDSNLen As Integer, iDescLen As Integer
    Dim hEnv As Long
    Dim sDSN As String * 32, sDesc As String * 128

    ' Clear the list box
    List1.Clear
    ' Allocate env handle
    iRtn = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, _
        hEnv)
    ' Set environmental variables
    iRtn = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, _
        SQL_OV_ODBC3, SQL_IS_INTEGER)
    ' Get first data source
    iRtn = SQLDataSources(hEnv, SQL_FETCH_FIRST, sDSN, _
        Len(sDSN), iDSNLen, sDesc, Len(sDesc), iDescLen)
    Do While iRtn = SQL_SUCCESS
        ' Add DSN to listbox
        List1.AddItem Left$(sDSN, iDSNLen)
        ' See if there are any more
        iRtn = SQLDataSources(hEnv, SQL_FETCH_NEXT, sDSN, _
            Len(sDSN), iDSNLen, sDesc, Len(sDesc), iDescLen)
    Loop
    ' Free the handle
    iRtn = SQLFreeHandle(SQL_HANDLE_ENV, hEnv)
    ' Report the results
    Text1.Text = Str$(List1.ListCount)
End Sub
Figure 4.7 Listing the ODBC data sources using the ODBC API.
The project, included on the CD-ROM, consists of a general code module (ODBC.BAS), which contains the API declarations and constants, and a form module, frmDSNList.
ADO Capabilities
ADO is actually an interface to OLE DB, which provides a common access point for both relational and nonrelational data structures, as well as for such disparate and unstructured data sources as text, graphics, and email. In fact, with OLE DB, you can relate two entirely separate data sources to each other as long as an OLE DB driver has been written for each data source. For example, you could read an Employee table in an Oracle, Access, or DB2 database. You could then join the Emp_Email column in that table to the Pop3_Account_Name column in a Microsoft Outlook data source (when and if an OLE DB driver is written to access Outlook). Whereas ADO provides a high-level interface to the data, OLE DB is a low-level interface. By and large, VB developers can ignore the dirty business of bits and bytes in which OLE DB deals and concentrate on the streamlined interface provided by ADO. ADO encapsulates OLE DB functionality (and, by extension, data) and exposes it as objects. When dealing with OLE DB data sources, OLE DB is considered the data provider, whereas your application (and the Active Data Objects) is considered the data consumer.
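As a quick, minimal sketch of that streamlined interface, the following opens a connection and walks a result set. It assumes a project reference to the Microsoft ActiveX Data Objects library and uses the sample DSN and employee table referenced elsewhere in this chapter, with hypothetical credentials.

Dim cn As ADODB.Connection
Dim rs As ADODB.Recordset

Set cn = New ADODB.Connection
' MSDASQL is the OLE DB provider for ODBC; the DSN is the sample database
cn.Open "Provider=MSDASQL.1;Data Source=Coriolis VB Example;" & _
    "User ID=coriolis;Password=coriolis"

Set rs = New ADODB.Recordset
rs.Open "SELECT emp_no, emp_lname FROM employee", cn, _
    adOpenStatic, adLockReadOnly
Do While Not rs.EOF
    Debug.Print rs!emp_no; " "; rs!emp_lname
    rs.MoveNext
Loop
rs.Close
cn.Close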
Figure 4.8 The relationship between the Orders, LineItem, and Item tables.

Now, consider a selection from the Orders table along with all related rows in the LineItem and Item tables. This data hierarchy can be modeled using hierarchical cursors in the new Data Environment Designer. You can build a DataEnvironment object that encapsulates the hierarchy and then assign that DataEnvironment object, rather than a Data control, to a data-aware control's DataSource property. When you do so, the DataEnvironment object's Command objects are exposed as data members. In Figure 4.9, I am creating a DataEnvironment object using the Data Environment Designer. After creating a DataEnvironment object, you can insert stored procedures previously created on the database, or you can add new Connection objects or Command objects. A Connection object represents a connection to the database, including various parameters such as user ID, password, and timeout. A Command object can be based on a table (similar to a DAO RecordSource of the Table type), an existing SQL view or stored procedure, or an SQL statement. In Figure 4.9, I am creating an SQL statement type of Command.
Figure 4.9 Creating the deADOExample DataEnvironment object using the Data Environment Designer.
When placing a data-aware control on a form, you can set its DataSource property to any valid DataEnvironment object. Because the DataEnvironment object is a project-level object (as opposed to a form-level object, such as a Data control), any control on any form may access it. Once you have selected a DataEnvironment object as a control's DataSource property, you can use the control's DataMember property to select any of the Command objects of the DataEnvironment object. In Figure 4.10, I have just selected my deADOExample DataEnvironment object as the DataSource for the DataGrid control. I then selected the Ord_Rpt Command object as the DataMember property.
Figure 4.10 Using a DataEnvironment object as the DataSource of a data-bound control.

Of course, none of this is to say that you can't use the ADO Data control as you would the DAO Data control or the RDO Remote Data control. In Figure 4.11, I do exactly that. The figure shows two forms open inside an MDIForm. The top form is the form that was designed in Figure 4.10, displaying rows of data from the three tables represented in Figure 4.8. The bottom form is the same Employee Maintenance form that I created using DAO and RDO earlier in this chapter.
Figure 4.11 An MDIForm housing two other forms using a DataGrid control bound to a DataEnvironment object and some TextBox controls bound to an ADO Data control.
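The binding shown in Figure 4.10 can also be established in code rather than in the Properties window. Here is a minimal sketch, assuming the deADOExample object and its Ord_Rpt Command described above:

Private Sub Form_Load()
    ' Bind the grid to the project-level DataEnvironment at runtime
    Set DataGrid1.DataSource = deADOExample
    DataGrid1.DataMember = "Ord_Rpt"
End Sub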
The code for this application is included on the CD-ROM. To build it yourself, first create an MDI form named mdiADOExamples. Add two forms named frmADOEmployee and frmADOReport, and set both of their MDIChild properties to True. On the MDI form, add a File menu with three items: Report, Employee, and Exit. For the mnuFileReport_Click event, code the following:

frmADOReport.Show

In the mnuFileEmployee_Click event, code the following:

frmADOEmployee.Show

On frmADOEmployee, add an ADO Data control. (To put the control on the toolbox, select the Microsoft ADO Data control from the Project|Components menu.) Set its properties as follows: Align=2 - Align Bottom; Connect=DSN=Coriolis VB Example; DataSourceName=Coriolis VB Example; RecordSource=employee. Next, create an array of five TextBox controls named txtEmpDat. The DataSource property for each should be Adodc1. Set their DataField properties to emp_no, emp_fname, emp_lname, emp_hire_date, and emp_salary, respectively. You can set the DataFormat properties for the date and salary fields as you see fit. Add Label controls as shown in Figure 4.11, and then add the following code:

Private Sub Adodc1_MoveComplete(ByVal adReason As _
    EventReasonEnum, ByVal pError As Error, adStatus _
    As EventStatusEnum, ByVal pRecordset As Recordset)
    Adodc1.Caption = "Record " & _
        Str$(Adodc1.Recordset.AbsolutePosition) & " of " & _
        Str$(Adodc1.Recordset.RecordCount)
End Sub

On frmADOReport, add a DataGrid control named DataGrid1. Set the DataGrid control's properties as follows: AllowArrows=True; AllowSizing=True; Caption=Order Details; DataSource=deADOExample; DataMember=Ord_Rpt; WrapText=True. The DataField properties for the columns should be set to ord_no, ord_date, ord_cust_no, line_no, line_item_no, item_desc, line_qty, line_price, and line_total. The Caption and Format properties should be set appropriately, such as those shown in Figure 4.11.

deADOExample has one Connection named CorVBExample. Its SourceOfData property should be set to 3 - deUseOLEDBConnectString, and its ConnectionSource property to Provider=MSDASQL.1; Password=coriolis; User ID=coriolis; Data Source=Coriolis VB Example. The Connection object has two Command objects, named Orders_Only and Ord_Rpt. For Orders_Only, set the CommandType property to 2 - adCmdTable. Set other properties as follows: CommandText=Coriolis.Orders; CursorLocation=3 - adUseClient; CursorType=3 - adOpenStatic; and LockType=1 - adLockReadOnly. For the Ord_Rpt Command object, set properties as follows: CommandType=1 - adCmdText; CursorLocation=3 - adUseClient; CursorType=3 - adOpenStatic; LockType=1 - adLockReadOnly; and CommandText set to the following SQL statement:

SELECT orders.ord_no, orders.ord_date, orders.ord_cust_no,
    line_item.line_no, line_item.line_item_no, item.item_desc,
    line_item.line_qty, line_item.line_price, line_item.line_total
FROM item, line_item, orders
WHERE item.item_no = line_item.line_item_no
    AND line_item.line_ord_no = orders.ord_no
ORDER BY orders.ord_no, line_item.line_no

In the CD-ROM code listing, you will see other Command objects as well, though they are not used in the sample application. Each Command object exists in the context of a Connection object: the Connection object must exist first, and it must be specified as the parent of the Command object. This brings us to the simplified (and nonhierarchical) object model of ADO relative to DAO and RDO, as shown in Figure 4.12. (Note that the figure itself is somewhat simplified. For instance, Errors is actually a collection of Error objects.)
Figure 4.12 The ADO data model.

One of the most impressive and useful aspects of ADO and VB6 is that you can create a DataEnvironment object and use it on multiple forms (indeed, in any module). Contrast that with the Data control or the Remote Data control, which must be re-created on each form that uses them. This change is a leap forward in the continued evolution of Visual Basic into an object-oriented development tool.
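To see the Connection-then-Command relationship in code rather than in the designer, here is a minimal ADO sketch. It again assumes a reference to the Microsoft ActiveX Data Objects library, the sample DSN, and hypothetical credentials; the orders table comes from the sample database.

Dim cn As ADODB.Connection
Dim cmd As ADODB.Command
Dim rs As ADODB.Recordset

' The Connection object must exist first...
Set cn = New ADODB.Connection
cn.Open "DSN=Coriolis VB Example;UID=coriolis;PWD=coriolis"

' ...and is then specified as the parent of the Command
Set cmd = New ADODB.Command
Set cmd.ActiveConnection = cn
cmd.CommandType = adCmdText
cmd.CommandText = "SELECT ord_no, ord_date FROM orders"

Set rs = cmd.Execute
Do While Not rs.EOF
    Debug.Print rs!ord_no; " "; rs!ord_date
    rs.MoveNext
Loop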