05-4 Madden High Performance Distributed Computing

The HiPer-D program aims to transition advanced computing technology to military applications, focusing on eliminating scalability bottlenecks in systems like AEGIS by leveraging commercial off-the-shelf (COTS) technology. The document outlines various technologies, architecture concepts, and benefits for the Navy, emphasizing distributed processing, fault tolerance, and resource management. It also discusses the implementation of a Display State Server and dynamic resource management to maintain performance and reliability in real-time distributed systems.


Surface Warfare Center Division

HIGH PERFORMANCE
DISTRIBUTED COMPUTING

Leslie Madden & Paul Werme


NAVSEA Dahlgren Division

June 2002
Surface Warfare Center Division
HiPer-D PROGRAM
DARPA GOAL: Transition Computing Technology to Military
HiPer-D Premise: New Computer Program & System Architecture Required to Fully Exploit COTS Technology
AEGIS GOAL: Eliminate Capacity & Scalability Bottlenecks

Technology & architecture

DARPA Technologies:
• Advanced computers
• Operating systems
• Advanced networks
• Low latency protocols
• Quality-of-service middleware
• Resource management

Architecture Concepts:
• Distributed processing
• Open systems
• Portability
• Scalability
• Fault tolerance
• Shared resource mgt.
• Self-instrumented

Navy Benefits:
• Load-invariant tactical performance
• Information access
• Mission flexibility
• Continuous availability
• Rapid upgrades
• Low ownership cost
June 2002
QoS REFERENCE ARCHITECTURE
[Layered block diagram, bottom to top: physical media; computer/network hardware; operating system services (time service, low-level I/O, network QoS, security services, mid-level protocols, auto-config., process failure detection, process startup); an O/S adaptation layer; QoS services (replication services, publish/subscribe, distributed objects, group ordered communication, security agent, name service, network monitor, failure monitor); and the application layer. Client- and server-side application QoS brokers tie application QoS specs to enterprise management, computer security management, and user application requirements; an application control agent with resource control and resource allocation works with a resource & QoS broker handling application QoS management & negotiation, resource management & negotiation, resource utilization, and a network QoS broker.]
HiPer-D DEMO BLOCK DIAGRAM

[Block diagram of the HiPer-D demonstration system. Recoverable elements: simulation and input sources (FIRE SIM 21 call for fire, UAV truth video, IFF SIM, missile flyout, OTH data, gyro ship's position, PSR/AACT/ABMX track inputs, RTS radar updates); QoS and resource management (QoS specs, Amaranth QoS broker, QuO, MSHN, Globus, mission priorities, Remos network instrumentation, system resource management with allocation decisions to resource management, network QoS); tactical processing (track server, track correlation & filter, 3D server, track-number server, ID server and doctrine, plan server, SM-2/AAW/TBMD engagement servers with deconfliction, engagement control, AAW doctrine auto-special, nav server, Display State Server, operator actions, manual engagement control); displays and C2 (AAWC, SUWC, MMWS, DWC); and FT CORBA/AQuA fault-tolerant components. Legend: simulation; AAW/TBMD; fault tolerant and/or scalable; Land Attack Call For Fire; C2/BMC; DARPA.]
Surface Warfare Center Division
HiPer-D/AQuA EXPERIMENT
Display State Server
• Replace a critical part of the HiPer-D tactical display subsystem.
• Maintain critical state pertaining to operator responsibilities (submodes) and operator alerts issued by tactical processing (e.g. engagement alerts)
• Distribute display status to registered tactical applications
• State critical – unacceptable to lose track of operator responsibility assignments or to lose an unhandled alert
• Soft real-time – interfaces with displays and multiple tactical components, including some with stringent real-time requirements
• Non-deterministic behavior – multiple threads of control; the order of handling of some events has the potential to affect resulting application actions and/or state

AQuA FT CORBA Framework
• Developed by University of Illinois & BBN
• Uses Ensemble Group Communications System and TAO ORB
• AQuA Gateway allows FT application to interact with other ORBs
• Complies with the spirit of the FT CORBA Specification:
  – Strong replica consistency via state transfer and Ensemble reliable ordered multicast
  – Warm passive & active replication
  – Voting mechanism for tolerance of value faults
  – Detection of & recovery from crash failures
  – Dependability manager to monitor replica status and manage replication level
  – Object factories to initiate new replicas
• Value added:
  – Support for some types of non-determinism via per-object-invocation state transfer
  – Replicas managed per process, not per object

Constraint: Fulfill all required functionality of the application within the context of
the existing system and with minimal change to the other applications.
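For readers unfamiliar with warm-passive replication, a minimal sketch follows (Python for illustration; the class and method names are hypothetical, not the AQuA API): the primary alone executes each invocation, then transfers its state to the backups, so any backup can take over without replaying requests.

```python
# Minimal warm-passive replication sketch (illustrative only; names are
# hypothetical and this is not the AQuA API). The primary executes each
# invocation, then transfers its state to every backup, so any backup can
# assume the primary role after a crash without replaying requests.
import copy

class DisplayStateReplica:
    def __init__(self):
        self.state = {"submodes": {}, "alerts": []}

    def post_alert(self, alert_id, text):
        self.state["alerts"].append((alert_id, text))

    def apply_state(self, snapshot):
        self.state = copy.deepcopy(snapshot)   # backups never execute methods

class WarmPassiveGroup:
    def __init__(self, size=3):
        self.replicas = [DisplayStateReplica() for _ in range(size)]
        self.primary = 0

    def invoke(self, method, *args):
        primary = self.replicas[self.primary]
        getattr(primary, method)(*args)        # only the primary executes
        for i, r in enumerate(self.replicas):  # state transfer after each call
            if i != self.primary:
                r.apply_state(primary.state)

    def fail_over(self):
        self.primary = (self.primary + 1) % len(self.replicas)

group = WarmPassiveGroup()
group.invoke("post_alert", 42, "engagement alert")
group.fail_over()                              # backup already holds the state
assert group.replicas[group.primary].state["alerts"] == [(42, "engagement alert")]
```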
June 2002
Surface Warfare Center Division
DSS ARCHITECTURE
[Block diagram of the Display State Server. Tactical applications register for state updates. Inputs: display state changes (cro, hooked track, balltab), submode actions, display alert actions, and post-alert/remove-alert requests from tactical processing. Internal components: Submode Control, Alert Control, Display Status Monitor, Display State Router, Submode Handler, Alert Handler, and Display Data Handler. Output: display state distributed to all registered applications.]
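The register/distribute pattern in the diagram can be sketched as follows (illustrative Python; the real DSS components are CORBA objects, and these names are assumptions):

```python
# Sketch of the register/distribute pattern the DSS diagram shows (assumed
# shape; the real DSS is a CORBA service, not this Python class).
from typing import Callable, Dict

class DisplayStateRouter:
    def __init__(self):
        self.subscribers: Dict[str, Callable[[dict], None]] = {}
        self.alerts: Dict[int, str] = {}

    def register(self, name: str, callback: Callable[[dict], None]) -> None:
        self.subscribers[name] = callback            # tactical app signs up

    def post_alert(self, alert_id: int, text: str) -> None:
        self.alerts[alert_id] = text
        self._distribute({"event": "post_alert", "id": alert_id, "text": text})

    def remove_alert(self, alert_id: int) -> None:
        self.alerts.pop(alert_id, None)
        self._distribute({"event": "remove_alert", "id": alert_id})

    def _distribute(self, update: dict) -> None:
        for callback in self.subscribers.values():   # fan out to registered apps
            callback(update)

router = DisplayStateRouter()
router.register("AAWC display", lambda u: print("AAWC got", u))
router.post_alert(7, "engagement alert")
router.remove_alert(7)
```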

June 2002
Surface Warfare Center Division
DSS/AQuA ARCHITECTURE
Warm-passive replicas using AQuA

[Block diagram: Display State Server replicas and the display clients each connect through an AQuA Gateway to the Maestro/Ensemble Group Communication System**. The Proteus Dependability Manager acts as replication manager, with replicas managed on a process (not object) basis; an Ensemble Gossip Server and one Object Factory per node where a replicated process may be allocated complete the framework. Each tactical application stacks Application / CORBA ORB (IIOP) / Gateway / Ensemble; gateway interceptors transform CORBA method calls into Ensemble messages.]

* AQuA passive replication performs state transfer after each method call on/by a replicated object.
** Ensemble GCS provides FT group membership, heartbeats, failure detection, and ordered atomic multicast.
June 2002
HiPer-D DEMO BLOCK DIAGRAM

[Repeat of the HiPer-D demonstration block diagram shown earlier; see the first figure placeholder for its contents.]
Surface Warfare Center Division
AAW ENGAGEMENT SERVER
Characteristics:
• Real-time – extremely fast recovery needed
• Event timers
• Multiple threads of control
• Message ordering via group communications (multiple groups)
• Out-of-band (pub/sub) communications
• Inputs that trigger sequences of events
• Multiple data sources & events that form a composite state

Technical Challenges:
• Effect of non-determinism introduced by thread execution order and timer expiration on composite state
• Maintaining real-time performance requirements – even during failure of a primary replica
• Replica restart – state includes both data and events

[Block diagram: engagement processing inside a semi-active FT framework. Inputs: manual, semi-auto, automatic, and auto-special engagement orders from the weapons control system; track status and data state updates from the track management subsystem; track ID from the ID subsystem, with out-of-band ID requests. Outputs: engagement status to the display subsystem (to all clients), operator notification of potential threats (alerts), and action & state updates from the primary to the replicas.]
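The last challenge deserves a sketch: a replica checkpoint must capture pending event timers as well as data, so a restarted replica can re-arm them with their remaining time (illustrative Python; names and structure are hypothetical, not the HiPer-D implementation).

```python
# Illustrative sketch: a replica checkpoint that captures pending event timers
# as well as data, so a restarted replica re-arms timers with the time they
# had remaining (names and structure are hypothetical, not the HiPer-D code).
import time

class EngagementServerState:
    def __init__(self):
        self.engagements = {}          # data portion of the state
        self.timers = {}               # event portion: id -> absolute deadline

    def arm_timer(self, timer_id, delay_s):
        self.timers[timer_id] = time.monotonic() + delay_s

    def checkpoint(self):
        now = time.monotonic()
        return {
            "engagements": dict(self.engagements),
            # store remaining time, not absolute deadlines, since the
            # restarted replica has its own clock
            "timers": {tid: max(0.0, t - now) for tid, t in self.timers.items()},
        }

    @classmethod
    def restore(cls, snapshot):
        replica = cls()
        replica.engagements = dict(snapshot["engagements"])
        for tid, remaining in snapshot["timers"].items():
            replica.arm_timer(tid, remaining)   # re-arm with remaining time
        return replica

primary = EngagementServerState()
primary.engagements["TN-1001"] = "SM-2 assigned"
primary.arm_timer("illuminate-deadline", 2.5)
backup = EngagementServerState.restore(primary.checkpoint())
```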
June 2002
Surface Warfare Center Division
STRONG REPLICA CONSISTENCY

From CORBA 2.6 Specification, Section 25.1.3.4


… requires that the states of the members of an object group
remain consistent (identical) as methods are invoked on
the object group and as faults occur.
• Active Replication
… at the end of each method invocation on the object
group, all of the members of the group have the same
state.
• Passive Replication
… at the end of each state transfer, all of the members of
the object group have the same state.
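The two definitions can be contrasted in a small sketch (illustrative Python, not drawn from the specification): an active group applies every invocation at every member, while a passive group converges at each state transfer.

```python
# Illustrative contrast of the two definitions (not from the specification).
# Active: every member executes each invocation. Passive: only the primary
# executes; members converge at each state transfer instead.
class Counter:
    def __init__(self): self.value = 0
    def add(self, n): self.value += n

# Active replication: apply every invocation at every member, in the same order.
active = [Counter() for _ in range(3)]
for n in (5, 7):
    for member in active:
        member.add(n)
assert len({m.value for m in active}) == 1     # identical after each invocation

# Passive replication: primary executes, then transfers state to the backups.
primary, *backups = (Counter() for _ in range(3))
for n in (5, 7):
    primary.add(n)
    for b in backups:
        b.value = primary.value                # state transfer
assert len({primary.value, *(b.value for b in backups)}) == 1
```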

June 2002
Surface Warfare Center Division
STATE CONSISTENCY

Requirement
• Maintain state data consistency among replicated components of a real-time distributed system to the degree required to effect replica fail-over within the required time limits

Technical Issues
• Multiple objects within a single application process may interact to create a composite state
• Multiple threads may be required within a single process – possibly multiple threads per object
• Event timers may be used to initiate periodic processing or monitor for time-out conditions
• Out-of-band communications, sometimes via unreliable channels (sensor interfaces, legacy interfaces, interfaces to persistent media)
• Time-based computations (e.g. an algorithm that extrapolates data forward in time)
• Sequences of actions resulting from a single initiating event
• State shared across objects in disparate, distributed processes
• High-throughput/low-latency interactions, with only a small subset of communications affecting the state that must be maintained across replicas
• State may include both application/object state and infrastructure/ORB state

[Sidebar figure: an application replica whose interacting objects together form a composite state.]
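As one concrete illustration of the composite-state issue (hypothetical Python; the real components are distributed CORBA objects), a snapshot taken for state transfer must capture all cooperating objects atomically, even while multiple threads update them:

```python
# Sketch of the composite-state issue: several objects, touched by several
# threads, must be snapshotted atomically for transfer to a replica
# (illustrative Python; the real components are CORBA objects).
import threading

class CompositeState:
    def __init__(self):
        self._lock = threading.Lock()
        self.tracks = {}        # object 1: track state
        self.doctrine = {}      # object 2: doctrine state

    def update_track(self, tn, data):
        with self._lock:
            self.tracks[tn] = data

    def set_doctrine(self, name, value):
        with self._lock:
            self.doctrine[name] = value

    def snapshot(self):
        # Both objects captured under one lock, so a state transfer never
        # mixes a new track with an old doctrine setting (or vice versa).
        with self._lock:
            return {"tracks": dict(self.tracks), "doctrine": dict(self.doctrine)}

state = CompositeState()
threads = [threading.Thread(target=state.update_track, args=(tn, {"id": "unknown"}))
           for tn in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(state.snapshot())
```

June 2002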
Surface Warfare Center Division
HiPer-D DYNAMIC RESOURCE MANAGEMENT

• Distributed Monitoring and Control Infrastructure
• Management of Heterogeneous Pool of Network and Host Resources
• Dynamic Allocation and Reallocation of Applications within the Computing Pool (see the sketch after this list)
  • to maintain user-specified system performance goals
  • to respond to faults, tactical load changes, and mission changes
• Integrated Infrastructure Components:
  • Monitoring
  • Decision-Making
  • Control
  • Specifications
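A minimal sketch of that monitor/decide/control cycle, with hypothetical names and thresholds (the real DRM is a set of distributed components, not one loop):

```python
# Minimal sketch of the monitor / decide / control cycle (hypothetical names
# and thresholds; the real DRM integrates monitoring, decision-making,
# control, and specifications as separate distributed components).
SPEC = {"TrackServer": {"max_cpu": 0.70}}        # user-specified performance goal

hosts = {"hostA": 0.95, "hostB": 0.30}           # monitored CPU utilization
allocation = {"TrackServer": "hostA"}

def reallocate(app):
    # Decision-making: pick the least-loaded host when the spec is violated.
    best = min(hosts, key=hosts.get)
    if hosts[allocation[app]] > SPEC[app]["max_cpu"] and best != allocation[app]:
        print(f"moving {app}: {allocation[app]} -> {best}")
        allocation[app] = best                   # control action

reallocate("TrackServer")                        # hostA over goal -> move to hostB
```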

June 2002
Surface Warfare Center Division
DRM ARCHITECTURE (HiPer-D Demo 01)

[Block diagram. Specifications: a QoS Spec Manager applies spec modifications to the System/QoS Specifications. Adaptive Resource Management: policy-selected resource allocation control algorithms, theater-level mission priority control, QoS managers, and host and network load analyzers, with Globus alongside. Monitor: collector/correlator, history servers, host discovery, hardware fault detection, UNIX and WinNT OS monitors, application instrumentation, network monitoring, and Quasar OS-level feedback control. Control: application control with UNIX and NT application control agents and application adaptation. All layers sit above the managed applications.]
June 2002
Surface Warfare Center Division
DEFINITION OF DRM OPEN INTERFACES
• Ongoing effort to define and implement open DRM interfaces
  • Use standards-based middleware where applicable
  • Allow for incremental enhancement of DRM capabilities by the research community
  • Allow alternate components and algorithms to be easily integrated and evaluated

• Initial Open Interface Candidates (the last candidate is sketched below):
  • Accessing and updating system specification information
  • Monitoring and updating host, network, and application status information
  • Accessing application-level instrumentation information
  • Providing alternate allocation/reallocation algorithms
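As a sketch of what an open interface for the last candidate might look like (hypothetical Python; the actual interfaces were being defined on top of CORBA services):

```python
# Sketch of a pluggable allocation-algorithm interface (hypothetical; the
# actual open interfaces were to be defined over standards-based middleware).
from abc import ABC, abstractmethod
from typing import Dict, List

class AllocationAlgorithm(ABC):
    """Pluggable decision-making component for the DRM."""

    @abstractmethod
    def allocate(self,
                 apps: List[str],
                 host_loads: Dict[str, float],
                 mission_priorities: Dict[str, int]) -> Dict[str, str]:
        """Return an application -> host mapping."""

class LeastLoadedFirst(AllocationAlgorithm):
    def allocate(self, apps, host_loads, mission_priorities):
        # Highest-priority applications choose hosts first.
        mapping = {}
        for app in sorted(apps, key=lambda a: -mission_priorities.get(a, 0)):
            host = min(host_loads, key=host_loads.get)
            mapping[app] = host
            host_loads[host] += 0.10     # assumed per-app load increment
        return mapping

algo = LeastLoadedFirst()
print(algo.allocate(["EngSvr", "IDSvr"], {"h1": 0.2, "h2": 0.5}, {"EngSvr": 9}))
```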

June 2002
Surface Warfare Center Division
DEFINITION OF DRM OPEN INTERFACES
• Approach:
• Define and document interface requirements:
• data requirements (needed information, data formats, etc...)
• control flow requirements
• QoS requirements (timing, scalability, survivability, security, etc...)
• Identify and evaluate relevant CORBA services
• determine services that support most or all interface requirements
• identify and document shortfalls
• Define detailed interfaces and APIs based on selected services
• Implement defined interfaces
• Evaluate performance of defined interfaces
• Document lessons learned and iterate

June 2002
Surface Warfare Center Division
DRM OPEN INTERFACE EXAMPLE

Example: Distribution and Updating of System Specification Information
• Spec info needed throughout RM infrastructure
• Different RM components require different spec info
• 10's to 100's of clients
• Clients (and possibly servers) brought up, down, and moved at run-time

[Figure: a Specification Server reads and writes specification files; system clients issue specification requests and receive responses over client connections; trusted clients submit specification modifications and updates; registered clients receive spec update notifications.]

Interface Requirements:
• Client Specification Requests:
  • Reliable soft real-time (bounded) response time requirements
  • "Timely" client notification when server(s) go down
• Specification Updates and Distribution of Updates:
  • Reliable near real-time pub/sub model needed for notification of updates
  • Soft RT requirements for maintaining state consistency throughout the system (low seconds)
  • Optional security/authentication needed for spec updates
• Derived Requirements: fault tolerance, state consistency, scalability
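A sketch of the spec-server pattern above (hypothetical names; the intended realization was on standards-based CORBA services, not a Python class):

```python
# Sketch of the spec-server pattern above (hypothetical names; the actual
# interface was to be built on CORBA services, e.g. for event notification).
class SpecificationServer:
    def __init__(self, specs):
        self.specs = dict(specs)          # loaded from specification files
        self.subscribers = []             # clients registered for notifications

    def request(self, key):
        return self.specs.get(key)        # bounded-time client request/response

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def update(self, key, value, trusted=False):
        if not trusted:                   # optional authentication for updates
            raise PermissionError("spec updates require a trusted client")
        self.specs[key] = value
        for notify in self.subscribers:   # near real-time update notification
            notify(key, value)

server = SpecificationServer({"TrackServer.max_cpu": 0.70})
server.subscribe(lambda k, v: print("spec update:", k, "=", v))
server.update("TrackServer.max_cpu", 0.60, trusted=True)
```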
June 2002
Surface Warfare Center Division
NETWORK QoS CONTROL
• Defining Network QoS Monitoring and Control Framework
• defining requirements, responsibilities, and interactions between:
• applications
• middleware
• resource management
• network QoS policy
• network QoS monitoring and control mechanisms

• Issues:
• At what level(s) should network QoS control mechanisms be accessible?
• Do Applications and Middleware require knowledge of and/or access to network
QoS control mechanisms?
• How should Network QoS Control Policy be incorporated into the framework?
• How to monitor Application and Middleware-level network information?
• How to control Application and Middleware-level network interfaces?

June 2002
Surface Warfare Center Division
RM NETWORK QoS CONTROL EXPERIMENT

Goal: RM monitoring and control of network QoS via interception and remapping of application/middleware-initiated QoS calls
• Transparent RM control of network QoS
• Ability to monitor and control middleware network QoS activity

Local vs. Global View of Network QoS Control Actions:
• Application-level:
  • local perspective only
  • knowledge of impact on application performance only
• Resource Management-level:
  • system-wide perspective
  • system-level view of impact across all applications

[Two notional layering figures: (1) application-level control of network QoS, with the application passing QoS requests through middleware to the network QoS control mechanisms; (2) layering and interfaces with resource management, in which Dynamic Resource Management receives mission priorities, application requirements, config info, and app statuses, validates requested QoS controls from the middleware, and directs a Network Resource QoS Controller that issues ordered QoS controls to the network QoS control mechanisms.]
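A sketch of the interception-and-remapping idea (hypothetical names and policy; the experiment itself used TAO, AQoSA, and RAPI): the RM-level controller validates each application request against mission priorities rather than granting the locally optimal reservation.

```python
# Sketch of intercepting and remapping an application QoS request against
# mission priorities (hypothetical names and policy; the experiment used
# TAO/AQoSA/RAPI, which this stub stands in for).
class NetworkResourceQoSController:
    def __init__(self, capacity_kbps, mission_priorities):
        self.capacity = capacity_kbps
        self.priorities = mission_priorities     # app -> mission priority

    def validate(self, app, requested_kbps):
        # RM-level, system-wide view: scale the request by mission priority
        # instead of granting each app its locally optimal reservation.
        share = self.priorities.get(app, 1) / sum(self.priorities.values())
        granted = min(requested_kbps, int(self.capacity * share))
        return granted

ctl = NetworkResourceQoSController(10_000, {"UAV video": 3, "OTH data": 1})
print(ctl.validate("UAV video", 9_000))   # remapped to its priority share: 7500
print(ctl.validate("OTH data", 9_000))    # remapped to 2500
```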
June 2002
Surface Warfare Center Division
RM NETWORK QoS CONTROL EXPERIMENT

Scenario: Unmanned Aerial Vehicle (UAV) Video Pipeline
Video mode switch (using the TAO A/V Streaming Service):
• Surveillance: low priority – disable reservation on link
• Targeting: high priority – enable reservation on link

[Figure: UAV video pipeline over CORBA. MPEG frames flow from a video source (MPEG file) through video distribution to a video display broker and video displays, across an RSVP-enabled network link.]

• Video pipeline developed as a joint effort by BBN, NSWCDD, OOMWorks, and WuSTL
• RAPI developed by USC/ISI

Experiment Architecture:
[Figure: Dynamic Resource Management (mission priorities, requirements, config info, app statuses) exchanges QoS requests, controls, and validation with the video distribution broker and video display applications, each stacking the TAO A/V Streaming Service, A/V streams, pluggable transport, QoS validation*, the ACE QoS API (AQoSA), and the RSVP API (RAPI) over an RSVP-enabled network; a Network Resource QoS Controller* issues ordered QoS controls. (* new components)]

3 Control Cases:
• Application QoS requests will be accepted, rejected, or modified based on mission priorities and requirements
• Asynchronous changes to network QoS may be ordered based on mission priority changes
• QoS change indications originating from network devices will be handled
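A sketch of the mode-driven reservation policy (hypothetical Python; a stub stands in for the RSVP reservation that RAPI/AQoSA would manage):

```python
# Sketch of the mode-driven reservation policy above (hypothetical names;
# the experiment used RSVP via RAPI/AQoSA, which this stub stands in for).
class LinkReservation:
    def __init__(self):
        self.active = False

    def enable(self, kbps):
        self.active = True
        print(f"RSVP reservation enabled: {kbps} kbps")

    def disable(self):
        self.active = False
        print("RSVP reservation disabled")

def set_video_mode(mode, reservation, targeting_kbps=4000):
    if mode == "targeting":       # high priority: guarantee bandwidth
        reservation.enable(targeting_kbps)
    elif mode == "surveillance":  # low priority: best effort
        reservation.disable()
    else:
        raise ValueError(f"unknown video mode: {mode}")

link = LinkReservation()
set_video_mode("targeting", link)
set_video_mode("surveillance", link)
```

June 2002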
