Clustering Technology
Overview
Clustering with Linux
Sungho Kim, Ph.D.
President
KESPER Inc.
Agenda
Linux Overview
Linux Kernel Features
Linux Network Protocols Overview
Linux Clustering Overview
Network Protocols for Clustering
HPC Cluster
Internet Cluster
HA Cluster
Conclusions
Linux Overview
Multi-user Multitasking Unix-like OS
Multi-architecture, multi-platform OS
Freely distributable open source OS : GNU software
IEEE POSIX compliance
Wide range of peripheral support
Wide configurability : From embedded to supercomputer
X Window Support
Full Network awareness
- Various Network Protocol Support
- TCP/IP, IPX/SPX, Appletalk, Samba, NFS, Web, Mail, etc
Full 32/64-bit support
Linux Hardware
Systems
IBM PC and compatibles
Apple Macintosh : from m68000 to PowerPC
SUN
SGI
Atari/Amiga
Compaq Alpha
Netwinder
CPU
Intel x86, AMD, Cyrix
Alpha EV5, EV6 (64-bit)
PowerPC
Sparc, UltraSparc (64-bit)
M68k
StrongARM/ARM
MIPS
Linux & Network
Network Interface Cards
10/100/1000 Mb/s Ethernet
Myrinet
ATM
Token Ring / FDDI / HIPPI
ARCnet
ISDN
X.25
Frame Relay
Fibre Channel
WAN
Network Applications
Web Server : Apache, Netscape
DHCP Server : dhcpd
FTP Server : proftpd ( ftp, ncftp )
Mail Server : sendmail / pine, mutt, elm
pop3 / imap / procmail
mailing list : majordomo
Chatting Server : irc ( bitchx, irc )
File Server : Samba
News Server : innd ( tin, pine, trn )
DNS Server : bind
NIS Server : NIS
Kernel Features
Kernel Options of 2.2.x
Code maturity level options
Processor type and features
Loadable module support
General Setup
Plug and Play support
Block devices
Networking options
SCSI options
SCSI low-level drivers
Network device support
Amateur Radio support
ISDN subsystem
CD-ROM drivers
Character devices
Mice
Video for Linux
Joystick support
Ftape, the floppy tape device driver
File systems
Network File Systems
Partition types
Console drivers
Sound
Kernel Hacking
Specific Features
Status of Kernel 2.4-test
USB support
Logical Volume Manager
Ext3 journaling file system
IrDA driver updates
Uses gas instead of as86
Athlon support
QuickCam support
XFree86 DRI (Direct Rendering Infrastructure)
Kernel HTTPD support
Direct decompression from Flash or ROM
I2O driver updates
DVD filesystem (UDF) support
Network Protocols
Supported Kernel Network Features
TCP/IP Protocol
IPX
Multicasting ( MBONE )
Tunneling ( GRE / Mobile-IP )
VPN
Advanced Router
WAN Router ( WAN Card + Linux )
Frame Relay / X.25 / leased line
HIPPI ( Cluster and Supercomputer )
Token Ring
IP Masquerading ( NAT )
IP Alias ( Virtual IP / Virtual domain )
Bridging ( Bridging / Load Balancing )
ISDN / xDSL
Network Protocols
Other Network Protocols
EQL ( Serial Line load balancing )
SLIP ( Serial Line Interface Protocol )
CSLIP (Compressed Serial Line Interface Protocol )
PPP ( Point-to-Point Protocol )
PLIP ( Parallel Line Interface Protocol )
X.25 : PLP ( Packet Layer Protocol )
HIPPI ( High Performance Parallel Interface )
FDDI ( Fiber Distributed Data Interface )
IPv6 ( IPng ) : Experimental
ARCnet
SNMP
Cluster System Overview
Category of Cluster Systems
Categories depend on their configuration method and applied areas
HPC Cluster : Computation-intensive
Bulk Storage Cluster : Stored data sharing and service
Web/Internet Cluster : Network load distribution and LB
HA Cluster : Increase the availability of systems
Components : Network + OS + Storage + API
HPC : High Performance Computing
HA : High Availability
LB : Load Balancer
Linux Cluster Network
IP Tunneling
Encapsulating data of protocol IP
VPN ( Virtual Private Network )
GRE tunneling
– Generic Routing Encapsulation
– CISCO Router
Mobile-IP for laptop
IP Firewalls/Masquerading
NAT ( Network Address Translation )
Modified firewall
IP auto forward
IP port forward
Filtering
Packet filtering
Linux Socket filtering (BSD socket filtering)
Unix domain socket filtering ( X-windows, syslog )
Firewall packet filter/IP masquerading
IP : kernel level autoconfig
Network booting
X terminal
TFTP / BOOTP / RARP
Linux HPC Cluster
Clustering Technology
A group of computers, connected by a pre-configured network,
that executes jobs in parallel
Beowulf : Linux based Cluster
Characteristics of Clusters
High Availability and expandability
High performance/price ratio
Personal Supercomputer
Linux HPC Cluster
Components of Clusters
Hardware
CPU : Intel Pentium, Digital Alpha, Mac G3
Network : Ethernet, Myrinet, ATM, Gigabit Ethernet
Storage : Fibre Channel/SCSI RAID
Software
Operating System & Compiler : Linux, Windows NT, DEC OSF
Communication Library : PVM, MPI
Administration Tool : CMS
Queuing Software : DQS, PBS
Application Libraries : BLACS, ATLAS, ScaLAPACK, PBLAS
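As a rough illustration of what a communication library such as PVM or MPI is used for, the sketch below splits one job across several workers and combines the partial results (a scatter/reduce pattern). It is not MPI: threads from Python's standard `concurrent.futures` module stand in for cluster nodes, and the names `scatter`, `partial_sum`, and `parallel_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(data, n_workers):
    """Split data into n_workers nearly equal contiguous chunks."""
    size, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        # The first `extra` workers each take one additional element.
        end = start + size + (1 if rank < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently on one 'node': a sum of squares."""
    return sum(x * x for x in chunk)


def parallel_sum(data, n_workers=4):
    """Scatter the data, compute partial results in parallel, then reduce."""
    chunks = scatter(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    return sum(partials)
```

On a real cluster the chunks would travel over the network to separate machines; the structure of the computation stays the same.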
Linux HPC Cluster
AVALON - Los Alamos National Lab.
Configuration of Hardware systems
Network Configuration
4x 3Com SuperStack II 3900 36-port fast ethernet switches
+ 3Com SuperStack II 9300 12-port Gigabit Ethernet switch
= switched network of 144 fast ethernet ports
Cyclades multiport serial switches
Cost : about $300 per port.
Node Configuration (140)
533MHz Alpha 21164A microprocessor
DEC AlphaPC 164LX motherboard
ECC SDRAM DIMMs (256 Mbytes total per node)
Quantum Fireball ST3.2A 3079Mb EIDE U-ATA drive
Kingston ethernet card with a DEC Tulip chipset
Linux HPC Cluster
[Figure: Avalon network topology. A 3Com 9300 Gigabit Ethernet switch links four 3900 switches, each serving one group of nodes; node labels in the figure run from Node 0 to Node 140.]
Linux HPC Cluster
Software Configurations
OS : RedHat Linux 5.0, kernel 2.1.125
MPICH and an in-house basic set of MPI routines
Compiler : egcs 1.1b
Application Programs
SPaSM
Gravitational tree code
Linux HPC Cluster
Performance (113/500)
Benchmark                 70 nodes       140 nodes
Linpack benchmark         19.7 GFlops    47.7 GFlops
SPaSM                     12.8 GFlops    29.6 GFlops
Gravitational treecode    10.0 GFlops    -
Price vs Performance
Price of Avalon : $313,000
Avalon's performance = 64-CPU 195 MHz SGI Origin 2000 (SPaSM, Tree code, and Linpack)
Price of 64-CPU SGI Origin 2000 = over 100 M$
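The figures above can be turned into two simple ratios: the speedup from 70 to 140 nodes, and the price per GFlops on the 140-node system. The short sketch below only does that arithmetic on the numbers given on the slide; the function and variable names are illustrative.

```python
AVALON_PRICE_USD = 313_000  # price of Avalon, from the slide

# GFlops from the performance table: (70 nodes, 140 nodes)
BENCHMARKS = {
    "Linpack": (19.7, 47.7),
    "SPaSM": (12.8, 29.6),
}


def speedup(gflops_70, gflops_140):
    """How much faster the 140-node run is than the 70-node run."""
    return gflops_140 / gflops_70


def usd_per_gflops(price_usd, gflops):
    """Purchase price divided by sustained performance."""
    return price_usd / gflops


def summarize():
    """Return both ratios for every benchmark with 70- and 140-node results."""
    return {
        name: {
            "speedup": speedup(g70, g140),
            "usd_per_gflops": usd_per_gflops(AVALON_PRICE_USD, g140),
        }
        for name, (g70, g140) in BENCHMARKS.items()
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name}: x{row['speedup']:.2f}, ${row['usd_per_gflops']:,.0f} per GFlops")
```

With the figures as given, doubling the node count more than doubles the reported GFlops on both benchmarks (about 2.4x for Linpack and 2.3x for SPaSM), and the 140-node Linpack result works out to roughly $6,600 per GFlops.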
Network Configuration
Simple Network Connection
Nodes have Internet IP Addresses
[Figure: the Internet and the intranet reach the cluster server farm (Server 1 ... Server n) through a DSU/Router and the LAN/WAN.]
Network Configuration
Double Network Connection
Nodes have Internet IP Addresses and Local IP Addresses
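[Figure: the Internet and the intranet reach the cluster server farm (Server 1 ... Server n) through a DSU/Router and the LAN/WAN; a second-layer network also connects the servers to one another.]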
Network Configuration
Double Network Connection + Master-Slave(NAT) configuration
Nodes have local IP Addresses
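[Figure: the Internet and the intranet reach a Master Server through a DSU/Router; the master connects over the LAN/WAN to Slave Server 1 ... Slave Server n, which also share a second-layer network inside the cluster server farm.]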
Crossbar Inter-connection
Second-layer network with crossbar connection on a 32-node cluster
[Figure: two groups of 16 nodes joined through the crossbar switches.]
32 Host Bus Adapters
12 Switches
64 Cables
I/O Connection
Keyboard-Video-Mouse and Disk IO connection
[Figure: a console (monitor, keyboard, mouse) is shared through a splitter/switcher by the Master and Node 0 to Node 6, each with an IO controller; the nodes attach over SCSI / FC to RAID Controller(0) and RAID Controller(1).]
Internet Cluster
Virtual Internet Cluster Server
Scalable and highly available server built on a cluster of real servers
The architecture of the cluster is transparent to end users, who see
only a single virtual server.
Methods to build Virtual Internet Cluster Server
Virtual Server via NAT
Virtual Server via IP tunneling
Virtual Server via IP filtering
Virtual Server via Direct Routing
Ref : www.linux-vs.org
Internet Cluster
Virtual Internet Cluster Server via NAT
This is done by network address and port translation.
The implementation builds on the Linux IP masquerading code and reuses
the port forwarding code.
See the ipfwadm command.
The whole process is shown in the following figures.
Internet Cluster
[Figure: Virtual Cluster Server via NAT. Users on the Internet or the intranet reach, through a DSU/Router, a load balancer (a Linux box or an L4 switch) that fronts Real Server 1 ... Real Server n on the LAN/WAN.]
Internet Cluster
How This Cluster Works ?
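(1) The user's requests arrive through the DSU/Router at the load balancer
(2) The load balancer ( Linux Box ) schedules and rewrites the packets
(3) Real Server n on the LAN/WAN processes the requests
(4) The load balancer rewrites the replies
(5) The replies are returned to the user
Virtual Cluster Server via NAT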
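The request flow above can be sketched in a few lines. The toy model below covers only the load balancer's part: scheduling and rewriting a request, then rewriting the matching reply. It is not the Linux Virtual Server code. Round-robin scheduling is an assumption (the slide says only "Scheduling"), and the `Packet` and `NatLoadBalancer` names and all addresses are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)
    payload: bytes = b""


class NatLoadBalancer:
    """Toy model of steps (2) and (4): schedule, rewrite, and un-rewrite."""

    def __init__(self, virtual_addr, real_servers):
        self.virtual_addr = virtual_addr
        self._next_server = cycle(real_servers)  # assumed round-robin scheduling
        self._connections = {}  # client address -> chosen real server

    def forward_request(self, pkt):
        """Step (2): pick a real server and rewrite the destination address."""
        if pkt.dst != self.virtual_addr:
            raise ValueError("packet is not addressed to the virtual server")
        server = self._connections.get(pkt.src)
        if server is None:
            server = next(self._next_server)
            self._connections[pkt.src] = server
        return Packet(src=pkt.src, dst=server, payload=pkt.payload)

    def forward_reply(self, pkt):
        """Step (4): rewrite the source so the reply appears to come from the virtual server."""
        if self._connections.get(pkt.dst) != pkt.src:
            raise ValueError("reply does not match a tracked connection")
        return Packet(src=self.virtual_addr, dst=pkt.dst, payload=pkt.payload)
```

Because both directions pass through the load balancer, the real servers can keep private addresses, at the cost of the balancer handling every reply as well as every request.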
Internet Cluster
Virtual Internet Cluster Server via IP Tunneling
IP tunneling (IP encapsulation) is a technique that encapsulates IP datagrams
within IP datagrams, which allows datagrams destined for one IP address
to be wrapped and redirected to another IP address.
IP encapsulation is now commonly used
in Extranet, Mobile-IP, IP-Multicast,
and tunneled hosts or networks.
The load balancer encapsulates the packet
and forwards it to the server.
When the server receives the encapsulated
packet, it decapsulates the packet,
processes the request, and finally returns
the result directly to the user.
See the NET-3-HOWTO document.
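To make the encapsulation step concrete, here is a minimal sketch that wraps one IPv4 datagram inside another and unwraps it again. It only manipulates bytes and sends nothing on the network; in a real deployment the kernel's tunnel device does this work. The helper names and example addresses are hypothetical, and the headers are simplified (no options, no fragmentation).

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4 encapsulation
HEADER_FMT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options
HEADER_LEN = struct.calcsize(HEADER_FMT)


def checksum(data):
    """Internet checksum: one's complement of the one's-complement 16-bit sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst, payload_len, protocol, ttl=64):
    """Build a minimal IPv4 header with a valid header checksum."""
    fields = [0x45, 0, HEADER_LEN + payload_len, 0, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(struct.pack(HEADER_FMT, *fields))
    return struct.pack(HEADER_FMT, *fields)


def encapsulate(inner_datagram, balancer_ip, real_server_ip):
    """Load balancer side: prepend an outer header addressed to the real server."""
    outer = build_header(balancer_ip, real_server_ip,
                         len(inner_datagram), IPPROTO_IPIP)
    return outer + inner_datagram


def decapsulate(outer_datagram):
    """Real server side: check the outer header and return the inner datagram."""
    header = struct.unpack(HEADER_FMT, outer_datagram[:HEADER_LEN])
    if header[6] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP datagram")
    return outer_datagram[HEADER_LEN:]
```

The inner datagram is carried unchanged, so after decapsulation the real server still sees the user's address as the source and the virtual IP address as the destination, which is what lets it reply to the user directly.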
Internet Cluster
[Figure: Virtual Cluster Server via IP Tunneling. (1) Requests from the user arrive through the DSU/Router at the load balancer ( Linux Box ), which is assigned the virtual IP address and reaches Real Server 1 ... Real Server n over IP tunnels across the LAN/WAN; replies go to the user directly.]
Internet Cluster
(1) The user's requests arrive through the DSU/Router at the load balancer ( Linux Box ), which is assigned the virtual IP address
(2) The load balancer encapsulates the packets and sends them over the LAN/WAN
(3) Real Server 1 ... Real Server n de-encapsulate the packets and reply to the user
Virtual Cluster Server via IP Tunneling
Storage Cluster
Network is configured with one of the virtual cluster server techniques.
The disk storage is connected with Fibre Channel including SAN file systems.
Fibre Channel : half-duplex 100 MBytes/sec, full-duplex 200 MBytes/sec
[Figure: Linux Storage Cluster with GFS or SAN. Servers behind the Internet DSU/Router and the LAN/WAN connect through an FC switch to Fibre RAID storage.]
Storage Cluster
SAN(Storage Area Network)
– Scalability : 125 disks with one controller
– Ease of management
– Fast disk I/O speed : 100 MBytes/sec ( half-duplex ), 200 MBytes/sec ( full-duplex )
– Long distance : over 10 km ( fiber-optic cable )
[Figure: Linux Storage Cluster with GFS or SAN. A Fibre Channel switch (half-duplex 100 MBytes/sec, full-duplex 200 MBytes/sec) connects the servers to Fibre RAID storage.]
Storage Cluster
File Systems Supported by Linux
– ext2/ext3 file systems
– ISO 9660 (CD-FS)
– VFAT / FAT
– SMB (CIFS)
– UFS
– NTFS
– UDF ( DVD-FS )
– NFS / CODA
– LVM ( Logical Volume Manager )
– GFS ( Global File System )
– ReiserFS ( Journaling File System ), SGI XFS, IBM JFS
– RIO ( Raw IO )
– RAMFS
– ROMFS
GFS Storage Cluster
Feature Overview about GFS
The Global File System (GFS) allows multiple Linux machines to
share storage devices over a network. Each machine sees the
network disks as local, and GFS itself appears as a local file
system. Writes to a file by one Linux machine are seen by
another machine that later reads that file.
GFS Cluster Configuration
Normal Configuration
GFS Cluster Configuration
Complex Configuration
Cross-bar FC connection
GFS Cluster Configuration
NFS Configuration
GFS Configuration
Hybrid Configuration
GFS Cluster Performance
High Availability Cluster
Enterprise Server Requirements
[Figure: server classes placed along Reliability, Availability, and Serviceability axes. Labels: Stand-alone, HA Server, Cluster, Fault-Tolerant, Non-Stop.]
High Availability Cluster
Server Downtime
Un-Planned Downtime
Hardware Fault
Software Fault
Planned Downtime
Hardware exchange
Hardware Upgrade
O/S upgrade
Software upgrade
Cost due to Downtime
Jobs                      Cost per hour
Stock Exchange            5.6 ~ 7.3 M$
Credit Card               2.2 ~ 3.1 M$
TV Shopping               87 ~ 140 K$
Sell Products             60 ~ 120 K$
Air-ticket reservation    67 ~ 112 K$
ATM Fee                   12 ~ 17 K$
High Availability Cluster
Comparison of Availability
Architecture             Max. availability   Downtime/failure     Downtime/year (minutes)
Continuous Processing    100.00%             None                 0
Fault Tolerant           99.999%             Cycles               0.5 ~ 5
Clusters                 99.9 ~ 99.999%      Seconds to minutes   5 ~ 500
High Availability        99.9%               Minutes              500 ~ 10,000 ( Disk Mirroring )
Server                   99.5%               Hours                1,000 ~ 10,000 ( Disk Mirroring )
Stand-Alone              99%                 Hours                2,600 ~ 10,000 ( Without Disk Mirroring )
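An availability percentage translates into yearly downtime with one multiplication. The sketch below shows that conversion for the availability levels named in the comparison; the function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Minutes per year a system is unavailable at the given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for percent in (99.999, 99.9, 99.5, 99.0):
        minutes = downtime_minutes_per_year(percent)
        print(f"{percent}% available -> about {minutes:,.1f} minutes down per year")
```

For example, 99.999% corresponds to about 5.3 minutes of downtime per year and 99.9% to about 526 minutes. This converts only the availability figure; the ranges in the comparison are broader than these single values.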
High Availability Cluster
Concept of HA system
Dual Network for Response
Heartbeat
Active or Standby Systems
Dual IO connection for Storage
Shared Storage
High Availability Cluster
Lines of heartbeat
Serial Connection
TCP/IP over LAN
Shared SCSI
Components of HA
Redundant Systems
All connectable lines
Shared Disks
Filesystem
Management software
including Heartbeat checking daemon
Ref : www.linux-ha.org
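The idea behind a heartbeat checking daemon can be sketched as a simple timeout detector: the standby system records each heartbeat it receives and takes over when none has arrived for too long. This is a minimal illustration, not the linux-ha implementation; the class name, the 3-second timeout, and the injectable clock are assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Standby-side failure detector: take over when heartbeats stop arriving."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_beat = clock()
        self.role = "standby"

    def beat(self):
        """Record a heartbeat received on any heartbeat line."""
        self._last_beat = self._clock()

    def check(self):
        """Promote the standby to active if the peer has been silent too long."""
        silent_for = self._clock() - self._last_beat
        if self.role == "standby" and silent_for > self.timeout_s:
            # A real system would also take over the service IP and shared storage.
            self.role = "active"
        return self.role
```

Using more than one heartbeat line (serial, LAN, shared SCSI) lowers the chance that a single broken cable is mistaken for a failed peer, since `beat()` can be called from whichever line still delivers.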
Concluding Remarks
Linux Internet Cluster Products
Wyz Cluster
DR Cluster
Mission Critical Linux
Red Hat : piranha, High Availability Server
Turbo Linux : Turbo Cluster Server
VA Linux : VACM
Legato Cluster
Veritas
…etc…
Linux clustering is a starting point from which Linux can enter the
enterprise market.
Until now, however, clustering technology has mainly been a concern
of technical development groups such as research institutes
and universities.
Concluding Remarks
Why Linux Cluster?
Cost Effective and Easy configurability
Fast technical development with open source
Many references in various fields
Future Needs
New network configurability and TCP/IP stack performance.
High-Availability for enterprise markets
Cluster filesystem and disk I/O performance
High performance peripheral drivers
Stable management and scheduler
Concluding Remarks
Do Not Believe the Myths !
Is clustering technology mature enough ?
Have ease of use and stability been achieved ?
Is clustering a big market ? If so, in which fields ?
Is Linux in the enterprise market ? If not, in backend systems ?
Can Linux vendors maintain their advantages ?
Thank You !!!
KESPER Inc.
RM 803 DongA Officetel BongMyeong-Dong YuSeong-Gu
Taejeon 305-709
Republic of Korea (South Korea)
Tel. 82-42-828-7458
Fax 82-42-828-7455
Sungho Kim, President/CEO
[email protected] or
[email protected]