Grid Computing:
Standards and Architecture
Martin F. Maldonado, Ph.D.
Technical Architect
IBM Grid Computing, Americas
[email protected]
© 2003 IBM Corporation
Contents
• On Demand Business and Grid Computing
• Grid Standards
• Open Grid Services Architecture
• Grid Services
• Data Access and Integration Services
• Globus Project and Toolkit
• Autonomic Computing
• Additional Information
Computing Evolution
On Demand
“Dynamic, Responsive, Integrated”
Network-Centric
“The Internet”
Client-Server
“PCs / LANS”
Mainframe
“The Glass House”
On Demand Operating Environment Attributes
Open, Integrated, Virtualized, Autonomic
…an approachable, adaptive, integrated and reliable infrastructure delivering on demand services for on demand business operations …
Virtualized
Storage, Operating System, I/O, Processing, Applications, Data
Grid Computing
Distributed Computing Over a Network, Using Open Standards to Enable Heterogeneous Operations
What is a Grid?
• There are three key criteria:
– Coordinates resources that are not subject to centralized control …
– Using standard, open, general-purpose protocols and interfaces …
– to deliver non-trivial qualities of service.
• What is not a Grid?
– A cluster, a network attached storage device, a scientific
instrument, a network, etc.
– Each is an important component of a Grid, but by itself does not
constitute a Grid
– The web is not (yet) a Grid; its open, general-purpose protocols
support access to distributed resources but not the coordinated use
of those resources to deliver interesting qualities of service
What is the Grid? A three point checklist, Ian Foster, GRIDToday, July 22, 2002, Vol 1 No. 6
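One way to read the three criteria above is as a conjunction: a system counts as a Grid only when all three hold. The Python sketch below is purely illustrative. The `System` record, its field names, and the example values are assumptions made for this illustration and are not taken from the cited article.

```python
from dataclasses import dataclass


@dataclass
class System:
    """Hypothetical description of a candidate system (field names are assumptions)."""
    name: str
    decentralized_control: bool    # coordinates resources not under centralized control
    open_standard_protocols: bool  # standard, open, general-purpose protocols and interfaces
    nontrivial_qos: bool           # delivers non-trivial qualities of service


def is_grid(system: System) -> bool:
    """All three checklist items must hold at the same time."""
    return (system.decentralized_control
            and system.open_standard_protocols
            and system.nontrivial_qos)


# Illustrative values only: a cluster is centrally controlled, and the web
# lacks coordinated use of resources to deliver qualities of service.
cluster = System("cluster", decentralized_control=False,
                 open_standard_protocols=True, nontrivial_qos=True)
web = System("web", decentralized_control=True,
             open_standard_protocols=True, nontrivial_qos=False)
```

With these example values, `is_grid(cluster)` and `is_grid(web)` are both false, matching the "What is not a Grid?" bullets.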
Grid Standards
The Value of Open Standards
Distributed Computing: Grid (Globus -> OGSA)
Applications: Web Services (SOAP, WSDL, UDDI)
Operating System: Linux
Information: World-wide Web (HTML, HTTP, J2EE, XML)
Communications: e-mail (POP3, SMTP, MIME)
Networking: The Internet (TCP/IP)
Open Grid Services Architecture (OGSA)
“The TCP/IP of Grid Computing”
[Figure: components labeled "OGSA Enabled" connected through OGSA]
Global Grid Forum
• A community-initiated forum of 5000+ individual researchers and practitioners
working on distributed computing, or "grid" technologies.
• Formed in 2001 by a Merger of Grid Organizations
– European eGrid
– US Grid Forum
– Asia Pacific Grid Community
• Primary objective is to promote and support the development, deployment, and
implementation of Grid technologies and applications via the creation and
documentation of "best practices" - technical specifications, user experiences, and
implementation guidelines.
• Participants come from over 400 organizations in over 50 countries, with financial
and in-kind support coming from sponsor members including technology producers
and consumers, as well as academic and federal research institutions.
• Modeled After IETF and IRTF
– Meets Three Times Per Year
– Areas, Working Groups and Research Groups
– Consensus Based
– Open Membership, Most Work Done on Mailing Lists
• IBM is a Platinum Sponsor Member
– Member of Steering Committee
– Member of External Advisory Committee
– Area Directors
– Working Group Chairs
Source: www.ggf.org
GGF Sponsors
Charter Sponsor Members
– Argonne National Laboratory
– NASA Information Power Grid
2002 Platinum Sponsor Members
– Compaq
– Hewlett Packard
– IBM
– Microsoft
– Platform Computing
– Qwest Communications
– Sun Microsystems
– SGI
– US Department of Energy (DOE), Office of Scientific Computing Research
– US National Science Foundation, Division for Advanced Computational Infrastructure and Research (NSF-ACIR)
2002 Gold Sponsor Members
– Level 3 Communications
– Intel
– National Computational Science Alliance (NCSA)
– San Diego Supercomputer Center (SDSC)
– National Institute of Advanced Industrial Science and Technology, Japan (AIST)
2002 Silver Sponsor Members
– Avaki
– Entropia
– Fujitsu America
– Hitachi
– InSORS Integrated Communications
– Johnson & Johnson
– United Devices
– University of Virginia
Source: www.ggf.org
IBM Active Industry Participation in GGF
APE: Boeing
ARCH: Avaki, Fujitsu, IBM, Platform, Sun (JINI only)
DATA: Avaki, IBM
GIS-PERF: Platform, IBM
SCHED: IBM, Intel, Sun
GS: IBM, Verisign
Open Grid Services Architecture
Open Grid Services Architecture Objectives
• Distributed Resource Management across heterogeneous
platforms
• Seamless QoS delivery
• Common Base for Autonomic Management Solutions
• Common infrastructure building blocks to avoid
"stovepipe solution towers"
• Open and Published Interfaces
• Industry-standard integration technologies
– Web services, SOAP, XML...
• Seamless integration with existing IT resources
– Separate interface from implementation
Distributed Computing: A Common Problem
• Web services, Autonomic computing and Grid efforts all try to
address aspects of distributed computing:
– Defining an open distributed computing paradigm.
– Dealing with heterogeneous platforms, protocols and applications.
• Grid has focused on Scientific / Technical Computing across
organizational boundaries
– Here, secure, distributed Resource Sharing is the key
– But no standards exist for inter-operability or pluggable components
• Web Services' initial focus has been on application integration
– not resource provisioning or system integration
• Autonomic computing is focused on managing commercial IT
infrastructures:
– Here, sharing resources is not the issue: Managing them is!
– Sharing function is not the issue: Building solutions on top is!
The Best of Both Worlds
Web Services & Grid Protocols
Open Grid Services Architecture: share, manage, access
Continuous Availability
Web Services:
– Applications on demand
– Secure and universal access
– Business integration
Grid Protocols:
– Resources on demand
– Global accessibility
– Vast resource scalability
Architecture Framework
OGSA Structure
Layers, top to bottom:
– Applications
– System Management Services, Grid Services
– Autonomic Capabilities
– Professional Services
– Open Grid Services Architecture (OGSA)
– OGSI – Open Grid Services Infrastructure
– Web Services
– OGSA Enabled: Security, Workflow, Database, File Systems, Directory, Messaging
– OGSA Enabled: Servers, Storage, Network
Architecture Framework
OGSA Structure
Layers, top to bottom:
– Applications
– System Management Services, Grid Services
– Autonomic Capabilities
– Professional Services
– OGSI – Open Grid Services Infrastructure
– Web Services
– OGSA Enabled General Middleware: Security, Workflow, Database, File Systems, Directory, Messaging
– OGSA Enabled: Servers, Storage, Network
Architecture Framework
OGSA Structure – OGSI
[Diagram: System Management Services and Grid Services, above OGSI – Open Grid Services Infrastructure (Discovery, Lifecycle, Registry, Management, Factory, Notification, HandleMap), above Web Services]
• Exploits existing web services properties
– Interface abstraction (WSDL)
– Protocol, language, hosting platform independence
• Enhancement to web services
– State Management
– Event Notification
– Referenceable Handles
– Lifecycle Management
– Service Data Extension
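Two of the enhancements named on this slide, state management and event notification, can be pictured with a toy stateful service that keeps named service data and calls back subscribers when a value changes. This is a minimal sketch in plain Python rather than the OGSI interfaces themselves. The class, method, and callback names are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical callback signature: (service data name, new value).
Sink = Callable[[str, Any], None]


class StatefulService:
    """Toy service that holds named service data and notifies subscribers."""

    def __init__(self) -> None:
        self._service_data: Dict[str, Any] = {}
        self._sinks: Dict[str, List[Sink]] = {}

    def subscribe(self, name: str, sink: Sink) -> None:
        """Register a sink to be called when the named item changes."""
        self._sinks.setdefault(name, []).append(sink)

    def set_service_data(self, name: str, value: Any) -> None:
        """Update state, then push the change to every subscribed sink."""
        self._service_data[name] = value
        for sink in self._sinks.get(name, []):
            sink(name, value)

    def find_service_data(self, name: str) -> Any:
        """Pull-style access to the current state."""
        return self._service_data[name]


received = []
job = StatefulService()
job.subscribe("status", lambda name, value: received.append((name, value)))
job.set_service_data("status", "running")
job.set_service_data("owner", "alice")  # no subscriber, so nothing is pushed
```

After this runs, `received` holds only the `status` change, while `owner` is still available through the pull-style `find_service_data` call.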
Architecture Framework
OGSA Structure
System Management Services and Grid Services: Cluster Management, Resource Management, Problem Determination, Policy, Logging, Provisioning, Data Replication, File Transfer, Job Scheduling, Service Collections
OGSI – Open Grid Services Infrastructure: Discovery, Lifecycle, Registry, Management, Factory, Notification, HandleMap
Web Services
Architecture Framework
Products and Services for Grids
Layers, top to bottom:
– System Management Services, Grid Services
– Autonomic Capabilities
– IBM Global Services
– OGSA, OGSI – Open Grid Services Infrastructure (the globus project, www.globus.org)
– OGSA Enabled: Security, Workflow, Database, File Systems, Directory, Messaging
– OGSA Enabled, OGSA Enabled, OGSA Enabled
Grid Services
OGSA Services Model
Everything is represented by a (Grid) service
A service is a network-enabled entity that provides some capability
A service can be a computation resource, storage resource, network, program, database, and so on
Services can be transient, created dynamically and destroyed when no longer needed
Separates the definition of the interface from the protocols used to invoke it
Simplifies virtualization: encapsulation of diverse implementations behind a common interface
Virtualization allows:
– consistent resource access across multiple heterogeneous platforms with local and remote transparency
– mapping of multiple logical resource instances onto the same physical resource
– management of resources based on composition from lower-level resources
– composition of services to form more sophisticated services
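The idea of encapsulating diverse implementations behind a common interface, and of mapping several logical resources onto one physical resource, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OGSA code. `StorageService`, `InMemoryStorage`, and `PrefixedStorage` are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class StorageService(ABC):
    """Hypothetical common interface; callers depend only on this."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStorage(StorageService):
    """One possible implementation, standing in for a physical resource."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class PrefixedStorage(StorageService):
    """A logical resource composed from a lower-level storage service."""

    def __init__(self, backend: StorageService, prefix: str) -> None:
        self._backend = backend
        self._prefix = prefix

    def put(self, key: str, data: bytes) -> None:
        self._backend.put(f"{self._prefix}/{key}", data)

    def get(self, key: str) -> bytes:
        return self._backend.get(f"{self._prefix}/{key}")


# Two logical resources share one physical resource without colliding.
physical = InMemoryStorage()
alice = PrefixedStorage(physical, "alice")
bob = PrefixedStorage(physical, "bob")
alice.put("report", b"a")
bob.put("report", b"b")
```

Code written against `StorageService` cannot tell whether it holds the physical store or one of the logical views, which is the transparency the slide describes.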
Hosting Environment
OGSA does not address issues of implementation: programming model, programming language, implementation tools, or execution environment
Grid services are instantiated within a specific hosting environment
The hosting environment defines how a Grid service meets its obligations to Grid service semantics
– may rely on native operating system processes, with services implemented in a variety of languages
– may be implemented on a container or component-based hosting environment such as J2EE, WebSphere, .NET, and Sun ONE
Open Grid Services Infrastructure (OGSI)
Anatomy of a Grid Service
GridService interface (required)
• Service Data Access
• Lifetime Management
Other Interfaces (optional)
• Service creation (Factory)
• Service discovery (Registry)
• Notification
• Handle Management
• Other functions, e.g. Workflow, Auditing, Resource Management
[Diagram labels: Handle, Service Data Elements, Grid Service, Implementation, Hosting Environment]
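Service creation, handles, and lifetime management from this slide fit together roughly as follows: a factory creates transient instances, each instance is reachable through a handle, and an instance is destroyed once its termination time passes unless a client extends it. The Python sketch below is a toy model, not the OGSI portTypes. All class and method names, and the handle format, are hypothetical, and handle resolution is folded into the factory for brevity.

```python
import itertools
from typing import Dict, Optional


class GridServiceInstance:
    """Transient service instance with a handle and a termination time."""

    def __init__(self, handle: str, termination_time: float) -> None:
        self.handle = handle
        self.termination_time = termination_time

    def request_termination_after(self, new_time: float) -> None:
        """Keep-alive: push the termination time later, never earlier."""
        self.termination_time = max(self.termination_time, new_time)


class Factory:
    """Creates instances and (as a simplification) resolves their handles."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._instances: Dict[str, GridServiceInstance] = {}

    def create_service(self, now: float, lifetime: float) -> GridServiceInstance:
        handle = f"{self._prefix}/{next(self._counter)}"
        instance = GridServiceInstance(handle, now + lifetime)
        self._instances[handle] = instance
        return instance

    def resolve(self, handle: str) -> Optional[GridServiceInstance]:
        return self._instances.get(handle)

    def sweep(self, now: float) -> None:
        """Destroy every instance whose termination time has passed."""
        expired = [h for h, inst in self._instances.items()
                   if inst.termination_time <= now]
        for handle in expired:
            del self._instances[handle]


factory = Factory("hypothetical://example/jobs")
a = factory.create_service(now=0.0, lifetime=10.0)
b = factory.create_service(now=0.0, lifetime=10.0)
b.request_termination_after(30.0)  # client keeps b alive
factory.sweep(now=20.0)            # a has expired, b has not
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic. A real hosting environment would use its own time source.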
Open Grid Services Infrastructure (OGSI)
Grid Service Implementation Independence
Abstract service
interface remains the
same
Implementation
Hosting Environment
Other Middleware
Operating System
Hardware
Open Grid Services Infrastructure (OGSI)
Grid Service Implementation - Examples
Abstract service
interface remains the
same
Registry
Service
Implementation
Hosting Environment - J2EE
JNDI JDBC
Other Middleware LDAP
Database (DB2)
Operating System
Hardware
Open Grid Services Infrastructure (OGSI)
Grid Service Implementation - Examples
Abstract service
interface remains the
same
File Transfer
Service
Implementation
Hosting Environment - J2EE
Other Middleware
Database (DB2)
Operating System File System
Hardware Storage System (NAS/SAN)
Grid Data Access and Integration
Architectural Principles
• Heterogeneity Transparency
– The access mechanism should be independent of the actual implementation
• Location Transparency
– An application should be able to access data irrespective of its location
• Name Transparency
– An application should be able to access data without knowing its name or
location
– Data access should be via logical domains, qualified by predicates on attributes
of the desired object
• Distribution Transparency
– An application should be able to query and update data without being aware
that it comes from a set of distributed sources
• Replication Transparency
– Grid data may be replicated or cached in many places for performance and
availability
• Ownership and Costing Transparency
– Applications should be spared from separately negotiating for access to
individual sources, whether in terms of access authorization, or in terms of
access costs.
Source: Grid Database Access and Integration: Requirements and Functionalities
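Name and location transparency, as described above, mean that an application asks for data by logical domain and by predicates on attributes, not by name or site. A minimal Python sketch of that style of lookup follows. The catalog contents, attribute names, and the `select` function are all made up for illustration and do not come from the cited document.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical catalog: each entry describes one data object and where it lives.
CATALOG: List[Record] = [
    {"domain": "climate", "year": 2001, "format": "netcdf", "location": "site-a"},
    {"domain": "climate", "year": 2002, "format": "netcdf", "location": "site-b"},
    {"domain": "genomics", "year": 2002, "format": "fasta", "location": "site-a"},
]


def select(domain: str, predicate: Callable[[Record], bool]) -> List[Record]:
    """Return the objects in a logical domain whose attributes satisfy the predicate."""
    return [record for record in CATALOG
            if record["domain"] == domain and predicate(record)]


# The caller never mentions an object name or a site.
hits = select("climate", lambda record: record["year"] >= 2002)
```

Here the location appears only in the result, so data could move between sites or gain replicas without changing the calling code.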
Principal portTypes
• GridDataService
– Service Data Elements
• Logical Schema
• Physical Schema
• StatementNotificationTypes
• ResultFormatTypes
• DatabaseTypes
• SystemName
• TransactionCapability
• preparedStatements
• resultCollections
– Operation
• perform
– Messages
• gridDataServiceRequest
• gridDataServiceResponse
• GridDataTransport
– Service Data Elements
• LogicallySupportedTypes
• PhysicallySupportedTypes
• activeBlocks
– Operations
• perform
– Messages
• GridDataTransportStatement
• GridDataTransportResponse
• GridDataTransportFault
Source: Grid Database Service Specification
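The slide lists a single `perform` operation for GridDataService, with a request message and a response message. To give a feel for that request/response shape, here is a small Python sketch that wraps an in-memory SQLite database. It is an assumption-laden illustration, not the specification's interface: the real messages are defined by the Grid Database Service Specification, while the fields `statement`, `parameters`, `rows`, and `fault`, and the class `ToyGridDataService`, are invented here.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class GridDataServiceRequest:
    """Hypothetical request: one statement plus its bound parameters."""
    statement: str
    parameters: Tuple[Any, ...] = ()


@dataclass
class GridDataServiceResponse:
    """Hypothetical response: result rows, or a fault message on failure."""
    rows: List[Tuple[Any, ...]] = field(default_factory=list)
    fault: str = ""


class ToyGridDataService:
    """Exposes one generic operation, perform, over a database connection."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def perform(self, request: GridDataServiceRequest) -> GridDataServiceResponse:
        try:
            cursor = self._connection.execute(request.statement, request.parameters)
            rows = cursor.fetchall()
            self._connection.commit()
            return GridDataServiceResponse(rows=rows)
        except sqlite3.Error as exc:
            # Report the failure in the response instead of raising.
            return GridDataServiceResponse(fault=str(exc))


gds = ToyGridDataService(sqlite3.connect(":memory:"))
gds.perform(GridDataServiceRequest("CREATE TABLE runs (id INTEGER, site TEXT)"))
gds.perform(GridDataServiceRequest("INSERT INTO runs VALUES (?, ?)", (1, "site-a")))
result = gds.perform(GridDataServiceRequest("SELECT id, site FROM runs"))
bad = gds.perform(GridDataServiceRequest("SELECT * FROM missing_table"))
```

Routing every statement through one operation keeps the service interface the same regardless of the database behind it, which is the point of the heterogeneity transparency principle.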
Creating and Using Grid Data Services
Source: Grid Database Service Specification
Requestor Retrieving Data from Grid Data Service
Source: Grid Database Service Specification
Requestor Using Grid Services Ports
Source: Grid Database Service Specification
Query Request with Deliver to Third Parties
Source: Grid Database Service Specification
Sending Data from one GDS to Another
Source: Grid Database Service Specification
IBM Technology Directions
Federated DBMS Architecture
[Figure labels: Grid provider (two); Grid Client; Client; Client Proxy; Web Services Portal; Web Services Gateway; Firewall, Firewall 1, Firewall 2; Public Network; SOAP over HTTPS; OGSA Grid Services; Federated DBMS; pluggable, 'wrappered' data sources (DB2, Oracle, Documentum); JDBC, ODBC, etc.; Storage Tank Infrastructure]
Globus Project and Toolkit
Globus Project
• At its core, Globus is a research project. Globus research
focuses not only on the issues associated with building
computational grid infrastructures, but also on the problems that
arise in designing and developing applications that use grid
services.
• Organized around four main activities.
– Research: study basic problems in areas such as resource management,
security, information services, and data management.
– Testbed: assist in planning and building large-scale testbeds, both for our
own research and for production use by scientists and engineers.
– Software Tools: build robust research prototype software that runs on a
variety of interesting and important platforms.
– Applications: develop large-scale grid-enabled applications in collaboration
with scientists and engineers.
Source: www.globus.org
Globus Toolkit™
• The Globus Project provides software tools that make
it easier to build computational grids and grid-based
applications. These tools are collectively called the
Globus Toolkit™.
• Is an open architecture, open source software toolkit.
• Is used by many organizations to build computational
grids that support their applications.
Source: www.globus.org
Globus Toolkit™ Version 2.2 Layered Grid Architecture
Grid layers, top to bottom (shown beside the Internet Protocol Architecture: Application, Transport, Internet, Link):
• Application
• Collective – “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource – “Sharing single resources”: negotiating access, controlling use
• Connectivity – “Talking to things”: communication (Internet protocols) & security
• Fabric – “Controlling things locally”: access to, & control of, resources
“The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster, Kesselman,
Tuecke, Intl Journal of High Performance Computing Applications, 15(3), 2001.
Globus Toolkit™ Version 2.2 Key Protocols
• The Globus Toolkit™ v2 (GT2)
centers around four key protocols
–Connectivity layer:
• Security: Grid Security Infrastructure (GSI)
–Resource layer:
• Resource Management: Grid Resource Allocation
Management (GRAM)
• Information: Grid Resource Information Protocol
(GRIP/LDAP)
• Data Transfer: Grid File Transfer Protocol (GridFTP)
• Also key collective layer protocols
–Monitoring & Discovery, Replication, etc.
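For quick reference, the layer-to-protocol assignments on this slide can be written down as a small lookup table. The Python sketch below only restates what the slide lists. The dictionary layout and the `protocol_for` helper are illustrative choices, not part of the Globus Toolkit.

```python
# Layer -> {function: protocol}, transcribed from the slide above.
GT2_PROTOCOLS = {
    "Connectivity": {
        "Security": "GSI",
    },
    "Resource": {
        "Resource Management": "GRAM",
        "Information": "GRIP/LDAP",
        "Data Transfer": "GridFTP",
    },
}


def protocol_for(function: str) -> str:
    """Look up which protocol handles a function, and at which layer."""
    for layer, functions in GT2_PROTOCOLS.items():
        if function in functions:
            return f"{functions[function]} ({layer} layer)"
    raise KeyError(function)


summary = protocol_for("Data Transfer")
```

Counting the entries gives the four key protocols the slide refers to: one at the Connectivity layer and three at the Resource layer.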
Globus Toolkit 2 Layered Grid Architecture
Protocols, Services, and APIs
Per layer: Grid protocols and Globus services; Globus APIs
• Applications: applications utilize Globus services at lower levels; RSL, Composite service APIs, application level SDKs/APIs
• Collective: GARA, MDS (GRIS, GIIS); GARA Client API
• Resource: GRAM, GRIP, GridFTP, RSL, DUROC, GASS; globus_gram_client, globus_rsl, globus_gram_myjob, globus_duroc_control
• Connectivity: GSS-API; globus_gss_assist
• Fabric
GT3 Architecture Overview
• Other Grid Services: Workload Management, Diagnostics
• GT3 Data Services: Replica Management
• GT3 Base Services: Managed Job Service, Index Service, Reliable File Transfer Service, File Streaming Service
• GT3 Security Services: Secure Conversation Service
• GT3 Core: GridService, NotificationSink, NotificationSubscription, NotificationSource, Registration, HandleResolver, Factory
Autonomic Computing
Autonomic Vision
"Intelligent" open systems that...
– Hide complexity
– "Know" themselves
– Adapt to unpredictable conditions
– Continuously tune to meet performance goals
– Recover from failures
– Provide a safe environment
Providing customers with...
– Increased return on IT investment
– Improved resiliency
– Accelerated implementation of new capabilities
Autonomic Computing
• Self-Configuring: Adapt automatically to dynamically changing environments
• Self-Healing: Discover, diagnose, and react to disruptions
• Self-Optimizing: Monitor and tune resources automatically
• Self-Protecting: Anticipate, detect, identify, and protect against attacks from anywhere
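The self-optimizing behaviour, monitoring a resource and tuning it automatically, amounts to a feedback loop. The Python sketch below shows one very simple form of such a loop for a hypothetical worker pool. The thresholds, the doubling and halving rule, and all names are assumptions chosen for illustration and do not describe any IBM product.

```python
from typing import Iterable, List


def tune_pool_size(pool_size: int, utilization: float,
                   low: float = 0.3, high: float = 0.8,
                   min_size: int = 1, max_size: int = 64) -> int:
    """One monitor-and-tune step: grow when busy, shrink when idle."""
    if utilization > high:
        return min(max_size, pool_size * 2)
    if utilization < low:
        return max(min_size, pool_size // 2)
    return pool_size


def run(samples: Iterable[float], pool_size: int = 4) -> List[int]:
    """Feed a stream of utilization samples through the tuning step."""
    history = []
    for utilization in samples:
        pool_size = tune_pool_size(pool_size, utilization)
        history.append(pool_size)
    return history


# Two busy samples, one normal sample, two idle samples.
history = run([0.9, 0.95, 0.5, 0.1, 0.1])
```

Starting from a pool of 4, the sizes after each sample are 8, 16, 16, 8, and 4: the pool grows under load, holds steady in the normal band, and shrinks again when idle.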
Autonomic Element
Autonomic Components in a Hierarchy
Self-Configuring Example: DB2 Configuration Advisor
[Diagram labels: Basic description, Hardware characteristic detection, Expert heuristics, Configuration model, Configuration settings]
Speeds deployment
Improves performance
Frees up resource
[Chart: DB2 Configuration Advisor Results. Vertical axis: Performance as Percentage of DBA tuned Solution, 0% to 250%. Workloads: OLTP - 32, OLTP - 64, Cust #1, Cust #2. Series: Default configuration, DBA tuned, Advisor as percentage of tuned]
Autonomic Examples
• Systems Management
– Access / Identity Managers
– Storage Resource Manager
– Service Level Advisor
• Client
– ImageUltra
– Rapid Restore PC
– Embedded Security Subsystem
• Application
– Prioritization of User Transactions
– Custom Advisors
– Problem Analysis and Recovery
• Database & Collaboration
– DB2 Query Patroller
– Tivoli Analyzer for Domino
• Servers
– Dynamic Partitioning
– IBM Director
– BladeCenter
• Storage
– Intelligent cache configuration
– Predictive Failure Analysis
– Dynamic volume expansion
Additional Information
Introduction to Grid Computing Video
• Available at
www.ibm.com/grid
• View online or download
Content:
What is Grid Computing
Benefits of Grid Computing
OGSA
Customer Testimonials
ITSO Redbook
• Redbook: Introduction to Grid
Computing with Globus
• Available:
December 2002
Download from www.redbooks.ibm.com
Content:
Presents the architecture and components to design a
Grid solution by using the Globus 2.0 Toolkit
Explains different Grid types
Architecture and security considerations
OGSA and Grid middleware
Showcases several real-life application examples
Learning Services Class
• Course: Introduction to Grid Computing, the
Globus Toolkit and OGSA
Content:
2-day class, lecture-only
Based on the Globus tutorial of same name
Technical introduction both to Grid computing and the
Globus Toolkit incl. descriptions of the core components
Usage of the Globus Toolkit in various applications
Future directions of Grid computing and the Globus Toolkit
More Courses planned for 2003 (e.g. Globus Developers+Admin Toolkits)
Grid and Autonomic Computing Information
www.ibm.com/grid
www.ibm.com/autonomic
Questions?