The DL_POLY Tutorial
W. Smith
Computational Science and Engineering Department
CCLRC Daresbury Laboratory
Warrington WA4 4AD
Programme
A.M.
Overview of DL_POLY packages
MD on distributed parallel computers
DL_POLY under the hood
DL_POLY on HPCx
The DL_POLY GUI
Overview of DL_POLY Hands-on session
P.M.
DL_POLY Hands-on session
Part 1
Overview of the DL_POLY Packages
DL_POLY Background
General-purpose parallel MD code
Developed at Daresbury Laboratory for CCP5, 1994 to the present
Available free of charge (under licence) to university researchers world-wide
DL_POLY Versions
DL_POLY_2
Replicated Data, up to 30,000 atoms
Full force field and molecular description
DL_POLY_3
Domain Decomposition, up to 1,000,000 atoms
Full force field but no rigid body description
The DL_POLY Force Field
$$
\begin{aligned}
V(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N) ={}& \sum_{i,j}^{N'} U_{pair}\bigl(|\mathbf{r}_i - \mathbf{r}_j|\bigr) + \frac{1}{4\pi\epsilon_0}\sum_{i,j}^{N'} \frac{q_i q_j}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{i,j,k}^{N'} U_{3body}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k) \\
&+ \sum_{i,j,k,n}^{N'} U_{4body}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k, \mathbf{r}_n) + \sum_{i,j}^{N'} U_{metal}(r_{ij}) - C \sum_{i=1}^{N} \rho_i^{1/2} \\
&+ \sum_{i_{bond}}^{N_{bond}} U_{bond}(i_{bond}, \mathbf{r}_a, \mathbf{r}_b) + \sum_{i_{angle}}^{N_{angle}} U_{angle}(i_{angle}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c) \\
&+ \sum_{i_{dihed}}^{N_{dihed}} U_{dihed}(i_{dihed}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c, \mathbf{r}_d) + \sum_{i_{invers}}^{N_{invers}} U_{invers}(i_{invers}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c, \mathbf{r}_d) + \sum_{i=1}^{N} U_{external}(\mathbf{r}_i)
\end{aligned}
$$

(The primed sums run over distinct sites; the $U_{metal}$ pair term and the density term $C\sum_i \rho_i^{1/2}$ together form the Sutton-Chen many-body potential.)
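For example, the pair term $U_{pair}$ may take the familiar 12-6 Lennard-Jones form, one of the many analytic forms DL_POLY supports:

$$U_{pair}(r_{ij}) = 4\epsilon\left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$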
DL_POLY Force Field
Intermolecular forces
All common van der Waals potentials
Sutton-Chen many-body potential
3-body angle forces (SiO2)
4-body inversion forces (BO3)
Intramolecular forces
Bonds, angles, dihedrals, inversions
DL_POLY Force Field
Coulombic forces
Ewald* & SPME (3D), HK Ewald* (2D),
Adiabatic shell model, Reaction field,
Neutral groups*, Bare Coulombic,
Shifted Coulombic
Externally applied field
Walled cells, electric field, shear field, etc.
* Not in DL_POLY_3
Boundary Conditions
None (e.g. isolated macromolecules)
Cubic periodic boundaries
Orthorhombic periodic boundaries
Parallelepiped periodic boundaries
Truncated octahedral periodic
boundaries
Rhombic dodecahedral periodic
boundaries
Slabs (i.e. x,y periodic, z nonperiodic)
Algorithms and Ensembles
Algorithms:
Verlet leapfrog
RD-SHAKE
Euler-Quaternion*
QSHAKE*
[All combinations]
Ensembles:
NVE
Berendsen NVT
Hoover NVT
Evans NVT
Berendsen NPT
Hoover NPT
Berendsen NσT
Hoover NσT
* Not in DL_POLY_3
DL_POLY_2&3 Differences
Rigid bodies not in _3
MSD not in _3
Tethered atoms not in _3
Standard Ewald not in _3
HK_Ewald not in _3
DL_POLY_2 I/O files work in _3 but NOT vice
versa
No multiple timestep in _3
No potential of mean force in _3
DL_POLY_2 Spin-Offs
DL_MULTI - Distributed multipoles
DL_PIMD - Path integral (ionics)
DL_HYPE - Rare event simulation*
DL_POLY - Symplectic version*
* Under development
The DL_POLY Java GUI
Works on any platform (requires Java 1.3)
Allows visualisation/analysis of
DL_POLY input and output files
Input file generation features
Force field builders (ongoing)
Can be used for job submission
Extendable by user
DL_POLY People
Bill Smith DL_POLY_2 & _3 & GUI
[email protected]
Ilian Todorov DL_POLY_3
[email protected]
Ian Bush DL_POLY optimisation
[email protected]
Maurice Leslie DL_MULTI
[email protected]
Information
http://www.cse.clrc.ac.uk/msi/software/DL_POLY/index.shtml
W. Smith and T.R. Forester, J. Molec. Graphics (1996), 14, 136
W. Smith, C.W. Yong, P.M. Rodger, Molecular Simulation (2002), 28, 385
Part 2
Molecular Dynamics on
Parallel Computers
Criteria for Assessing Parallel Strategies
Load Balancing:
Sharing work equally between processors
Sharing memory requirement equally
Maximum concurrent use of each processor
Communication:
Maximum size of messages passed
Minimum number of messages passed
Local versus global communications
Asynchronous communication
Scaling in Parallel Computing
Type 1 Scaling
Performance scaling with problem size
Type 2 Scaling
Performance scaling with number of processors
Performance Parameters
Time required per step:
T_s = T_p + T_c
where
T_s is the time per step
T_p is the processing (computation) time per step
T_c is the communication time per step
Performance Parameters
Can also write:
T_s = T_p(1 + R_cp)
where:
R_cp = T_c/T_p
R_cp is the Fundamental Ratio*
(*NB assumes synchronous communications)
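For example (illustrative numbers only): with T_p = 0.8 s and T_c = 0.2 s per step, R_cp = 0.25 and T_s = 0.8 x (1 + 0.25) = 1.0 s; halving the communication time halves R_cp and cuts T_s to 0.9 s.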
Key Stages in MD Simulation
Initialize: set up the initial system
Forces: calculate atomic forces
Motion: calculate atomic motion
Statistics: calculate physical properties
Repeat!
Summarize: produce the final summary
Basic MD Parallelization Strategies
Computing Ensemble (Cloning)
Hierarchical Control (Master-Slave)
Replicated Data*
Systolic Loops
Domain Decomposition*
Replicated Data
[Schematic: every processor runs the full pipeline in parallel on its own copy of the data - Initialize, Forces, Motion, Statistics, Summary]
Replicated Data MD Algorithm
Features:
Each node has a copy of all atomic coordinates, velocities and forces (Ri, Vi, Fi)
Force calculations shared equally between nodes (i.e. N(N-1)/2P pair forces per node)
Atomic forces summed globally over all nodes
Motion integrated for all or some atoms on each node
Updated atom positions circulated to all nodes (see the sketch below)
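A minimal sketch of one replicated-data force step, assuming mpi4py and numpy; pair_force() is a hypothetical short-range pair routine, not DL_POLY's own:

import numpy as np
from mpi4py import MPI

# Replicated-data force step: a minimal sketch. Every rank holds the
# full (N,3) coordinate array r and the full charge array q.
def rd_force_step(r, q, pair_force, comm=MPI.COMM_WORLD):
    rank, P = comm.Get_rank(), comm.Get_size()
    N = len(r)
    f_local = np.zeros_like(r)
    # Share the N(N-1)/2 pairs between ranks: each rank takes every P-th
    # pair (real codes would use a neighbour list instead of this O(N^2) list).
    pairs = [(i, j) for i in range(N - 1) for j in range(i + 1, N)]
    for i, j in pairs[rank::P]:
        fij = pair_force(r[i], r[j], q[i], q[j])
        f_local[i] += fij            # Newton's third law: equal and
        f_local[j] -= fij            # opposite force on the partner atom
    # Global sum replicates the complete force array on every rank.
    f = np.zeros_like(r)
    comm.Allreduce(f_local, f, op=MPI.SUM)
    return f

The single global sum of a 3N-element array is the communication overhead that grows with system size, which is why replicated data scales poorly for very large N.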
Replicated Data MD Algorithm
Advantages:
Simple to implement
Good load balancing
Highly portable programs
Suitable for complex force fields
Good type 1 scaling
Dynamic load balancing possible
Replicated Data MD Algorithm
Disadvantages:
High communication overhead
Sub-optimal type 2 scaling
Large memory requirement
Unsuitable for massive parallelism
Link Cell Algorithm
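The slide's link-cell diagram is not reproduced; in its place, a minimal serial sketch of the link-cell method (plain Python with numpy; all names are illustrative). Atoms are binned into cells of side at least rcut, so each atom need only be tested against its own and the 26 neighbouring cells, reducing the neighbour search from O(N^2) to O(N):

import itertools
import numpy as np

# Link-cell neighbour search: a minimal serial sketch.
# r: (N,3) numpy array of positions in a cubic box of side `box`.
def link_cell_pairs(r, box, rcut):
    ncell = max(int(box // rcut), 1)              # link cells per side
    side = box / ncell
    cells = {}                                    # cell index -> atom indices
    for i, ri in enumerate(r):
        key = tuple((ri // side).astype(int) % ncell)
        cells.setdefault(key, []).append(i)
    pairs = set()                                 # set avoids double counting
    for key, atoms in cells.items():
        # Search the cell itself and its 26 periodic neighbours.
        for d in itertools.product((-1, 0, 1), repeat=3):
            nkey = tuple((k + dk) % ncell for k, dk in zip(key, d))
            for i in atoms:
                for j in cells.get(nkey, []):
                    if i < j:
                        dr = r[i] - r[j]
                        dr -= box * np.round(dr / box)   # minimum image
                        if np.dot(dr, dr) < rcut * rcut:
                            pairs.add((i, j))
    return sorted(pairs)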
Domain Decomposition MD
[Schematic: the simulation cell split into four processor domains A, B, C, D]
Domain Decomposition MD
Features:
Short range potential cut off (rcut << Lcell)
Spatial decomposition of atoms into domains
Map domains onto processors
Use link cells in each domain
Pass border link cells to adjacent processors
Calculate forces, solve equations of motion
Re-allocate atoms leaving domains (see the sketch below)
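A minimal sketch of the domain bookkeeping these steps imply (plain Python with numpy; helper names are illustrative): atoms are mapped onto a px x py x pz process grid, and atoms within rcut of a domain face are flagged for the border (halo) exchange:

import numpy as np

# Domain decomposition bookkeeping: a minimal serial sketch.
# r: (N,3) positions in a cubic box of side `box`; pgrid: e.g. (2, 2, 2).
def split_atoms(r, box, pgrid, rcut):
    pgrid = np.array(pgrid)
    dlen = box / pgrid                            # domain edge lengths
    domains, borders = {}, {}
    for i, ri in enumerate(r):
        d = tuple((ri // dlen).astype(int) % pgrid)   # owning domain
        domains.setdefault(d, []).append(i)
        local = ri - np.array(d) * dlen           # position within the domain
        # Atoms near any face lie in the halo passed to the neighbour.
        if np.any(local < rcut) or np.any(dlen - local < rcut):
            borders.setdefault(d, []).append(i)
    return domains, borders

After each integration step the same mapping is reapplied, and atoms that have crossed a face are handed to the new owning processor, so the memory requirement stays fully distributed.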
Domain Decomposition MD
Advantages:
Good load balancing
Good type 1 scaling
Ideal for huge systems (~10^5 - 10^7 atoms)
Simple communication structure
Fully distributed memory requirement
Dynamic load balancing possible
Domain Decomposition MD
Disadvantages:
Problems with mapping/portability
Sub-optimal type 2 scaling
Requires short potential cut off
Complex force fields tricky
Part 3
DL_POLY: Under the Hood
Itinerary
Parallel force calculation
The parallel SHAKE algorithms
The Ewald summation
The SPME algorithm
Parallel Force Calculation
Bonded forces:
Algorithmic decomposition
Interactions managed by bookkeeping arrays, i.e. explicit bond definition
Shared bookkeeping arrays
Nonbonded forces:
Distributed Verlet neighbour list (pair forces)
Link cells (3- and 4-body forces)
Implementation differs between DL_POLY_2 and _3!
Scheme for Distributing Bond Forces
[Schematic: a chain of atoms A1 to A17; the bonds along the chain are dealt out across processors P0 to P8]
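In code, the dealing-out the diagram implies is just a strided slice of the replicated bond list (plain Python; names illustrative):

# Algorithmic decomposition of the bond list: a minimal sketch.
# bonds: replicated list of (atom_a, atom_b, bond_type) tuples.
def my_bonds(bonds, rank, nprocs):
    return bonds[rank::nprocs]       # this rank handles every nprocs-th bond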
DL_POLY_3 and Bond Forces
[Schematic: the force field definition refers to global atomic indices; mapping these onto the local atomic indices of each processor domain (P0, P1, P2, ...) is expensive]
Nonbonded Forces: RD Verlet List
[Table: the candidate pair list for 12 atoms laid out in rows - atom 1 pairs with atoms 2-7, atom 2 with 3-8, ..., atom 8 with 9, 10, 11, 12, 1, and so on cyclically - with the rows dealt out over the processors P0, P1, ...; the list is distributed, so no processor holds it all]
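A sketch of how each processor can generate its own rows of this list (plain Python, 0-indexed where the slide is 1-indexed): each atom is paired with the atoms that follow it cyclically, and the rows are dealt out over the P processors:

# Distributed Verlet candidate list: a minimal sketch of the half-list
# layout the table shows. For odd N every atom takes (N-1)/2 partners;
# for even N the first N/2 atoms take N/2 partners and the rest N/2 - 1,
# so every pair appears exactly once.
def rd_pair_rows(N, rank, P):
    rows = []
    for i in range(rank, N, P):                  # rows dealt out cyclically
        if N % 2:
            npart = (N - 1) // 2
        else:
            npart = N // 2 if i < N // 2 else N // 2 - 1
        rows.append([(i, (i + k) % N) for k in range(1, npart + 1)])
    return rows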
The SHAKE Algorithm
Constraint forces correct the two atoms of a constrained bond according to

$$\mathbf{r}_1 = \mathbf{r}_1' + \frac{\Delta t^2}{m_1}\mathbf{G}_{12}, \qquad \mathbf{r}_2 = \mathbf{r}_2' + \frac{\Delta t^2}{m_2}\mathbf{G}_{21}$$

where primes denote the unconstrained positions after integration and $\mathbf{G}_{12} = -\mathbf{G}_{21} = g_{12}\,\mathbf{d}_{12}^{\,0}$, with $\mathbf{d}_{12}^{\,0}$ the bond vector at the start of the step. Imposing the constraint length $d_{12}$:

$$d_{12}^2 = \left|\mathbf{d}_{12}' + \Delta t^2 g_{12}\left(\frac{1}{m_1} + \frac{1}{m_2}\right)\mathbf{d}_{12}^{\,0}\right|^2 = d_{12}'^2 + 2\Delta t^2 g_{12}\left(\frac{1}{m_1} + \frac{1}{m_2}\right)\mathbf{d}_{12}^{\,0}\cdot\mathbf{d}_{12}' + O(g_{12}^2)$$

and solving to first order for the constraint force magnitude:

$$g_{12} \approx \frac{\mu_{12}\left(d_{12}^2 - d_{12}'^2\right)}{2\Delta t^2\,\mathbf{d}_{12}^{\,0}\cdot\mathbf{d}_{12}'}$$

where $\mu_{12}$ is the reduced mass of the two atoms. The correction is applied iteratively until all constraints converge.
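A minimal serial sketch of the resulting iteration for a single bond constraint (numpy; names illustrative), applying the g12 formula until the bond length converges:

import numpy as np

# One SHAKE loop for a single bond constraint.
# r1, r2: unconstrained positions after the integration step;
# r1_0, r2_0: positions at the start of the step; d0: constraint length.
def shake_pair(r1, r2, r1_0, r2_0, m1, m2, d0, dt, tol=1e-8, maxit=100):
    mu = 1.0 / (1.0 / m1 + 1.0 / m2)             # reduced mass mu_12
    d_old = r1_0 - r2_0                          # bond vector d12^0
    for _ in range(maxit):
        d_new = r1 - r2                          # current bond vector d12'
        diff = d0 * d0 - np.dot(d_new, d_new)
        if abs(diff) < tol:
            break
        g = mu * diff / (2.0 * dt * dt * np.dot(d_old, d_new))
        G = g * d_old                            # constraint force G12 = -G21
        r1 = r1 + (dt * dt / m1) * G
        r2 = r2 - (dt * dt / m2) * G
    return r1, r2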
Parallel SHAKE
[Schematic: constrained molecular units MU1, MU2, MU3, MU4 distributed across processors]
Parallel SHAKE
[Schematic: a constraint bond crossing the boundary between Proc A and Proc B; the atom on the boundary is shared by both processors]
Parallel SHAKE
Initialisation:
Distribute molecular topology
Identify shared atoms
Reallocate bonds to reduce the number of shared atoms
Parallel SHAKE
Distributing the molecular topology
[Schematic: the constraint bookkeeping is split into the arrays list_me, list_in and list_shake; example atom indices 5, 13 and 22 appear in more than one array]
Parallel SHAKE
Distributing the molecular topology
[Schematic: a ring of bonded sites partitioned two ways - dealt out by atom index (sites 1-7) under replicated data, and cut into spatial regions (A-H) under domain decomposition]
The Ewald Summation
$$
\begin{aligned}
U_c ={}& \frac{1}{2V\epsilon_0}\sum_{\mathbf{k}\neq 0}\frac{\exp\!\left(-k^2/4\alpha^2\right)}{k^2}\left|\sum_{j=1}^{N} q_j \exp(-i\,\mathbf{k}\cdot\mathbf{r}_j)\right|^2 \\
&+ \frac{1}{4\pi\epsilon_0}\sum_{\mathbf{R}_l=0}^{\infty}\sum_{n<j}^{N}\frac{q_n q_j}{\left|\mathbf{R}_l+\mathbf{r}_{jn}\right|}\,\mathrm{erfc}\!\left(\alpha\left|\mathbf{R}_l+\mathbf{r}_{jn}\right|\right) - \frac{\alpha}{4\pi^{3/2}\epsilon_0}\sum_{j=1}^{N} q_j^2
\end{aligned}
$$

with $\mathbf{k} = \dfrac{2\pi}{V^{1/3}}(l, m, n)$.
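A minimal serial sketch of the reciprocal-space term (numpy; cubic cell of side L, reduced units with 4*pi*eps0 = 1 so the prefactor 1/(2*V*eps0) becomes 2*pi/V; names illustrative):

import itertools
import numpy as np

# Reciprocal-space Ewald sum: a minimal serial sketch.
# r: (N,3) positions; q: (N,) charges; kmax: integer cutoff per axis.
def ewald_recip(r, q, L, alpha, kmax):
    V = L ** 3
    U = 0.0
    for lmn in itertools.product(range(-kmax, kmax + 1), repeat=3):
        if lmn == (0, 0, 0):
            continue                             # k = 0 term excluded
        k = 2.0 * np.pi * np.array(lmn) / L
        k2 = np.dot(k, k)
        S = np.sum(q * np.exp(-1j * (r @ k)))    # structure factor S(k)
        U += np.exp(-k2 / (4.0 * alpha ** 2)) / k2 * np.abs(S) ** 2
    return 2.0 * np.pi / V * U                   # prefactor in these units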
Parallel Ewald Summation
Self interaction correction - as is.
Real Space terms:
Handle using the parallel Verlet neighbour list
For excluded atom pairs replace erfc by -erf
Reciprocal Space Terms:
Distribute over atoms
Partition over atoms: processor p holds atoms j = pN/P + 1, ..., (p+1)N/P and forms the partial structure factor

$$S_p(\mathbf{k}) = \sum_{j=pN/P+1}^{(p+1)N/P} q_j \exp(-i\,\mathbf{k}\cdot\mathbf{r}_j)$$

A global sum over the P processors then gives

$$S(\mathbf{k}) = \sum_{p} S_p(\mathbf{k}) = \sum_{j=1}^{N} q_j \exp(-i\,\mathbf{k}\cdot\mathbf{r}_j)$$

and each processor adds $\exp(-k^2/4\alpha^2)\,\left|S(\mathbf{k})\right|^2/k^2$ into its copy of the Ewald sum. Repeat for each k-vector.
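A minimal sketch of the partitioned structure factor with its global sum (mpi4py; names illustrative):

import numpy as np
from mpi4py import MPI

# Atom-decomposed structure factor for one k-vector: each rank sums over
# its own N/P atoms; an allreduce makes S(k) available on all ranks.
def structure_factor(r_local, q_local, k, comm=MPI.COMM_WORLD):
    S_local = np.sum(q_local * np.exp(-1j * (r_local @ k)))
    return comm.allreduce(S_local, op=MPI.SUM)   # S(k) on every rank

Each processor then accumulates exp(-k^2/4a^2) |S(k)|^2 / k^2 into the Ewald sum and the loop moves on to the next k-vector.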
Smoothed Particle-Mesh Ewald
Ref: Essmann et al., J. Chem. Phys. (1995), 103, 8577
The crucial part of the SPME method is the conversion of the reciprocal-space component of the Ewald sum into a form suitable for Fast Fourier Transforms (FFT). Thus:

$$U_{recip} = \frac{1}{2V\epsilon_0}\sum_{\mathbf{k}\neq 0}\frac{\exp\!\left(-k^2/4\alpha^2\right)}{k^2}\left|\sum_{j=1}^{N} q_j \exp\!\left(i\,\mathbf{k}\cdot\mathbf{r}_j\right)\right|^2$$

becomes:

$$U_{recip} = \frac{1}{2V\epsilon_0}\sum_{k_1,k_2,k_3} G^{T}(k_1,k_2,k_3)\, Q(k_1,k_2,k_3)$$

where G and Q are 3D grid arrays (see later).
SPME: Spline Scheme
Central idea - share discrete charges on a 3D grid.
Cardinal B-splines Mn(u) - in 1D:

$$\exp(2\pi i\, u_j k / K) \approx b(k)\sum_{l=-\infty}^{\infty} M_n(u_j - l)\,\exp(2\pi i k l / K)$$

$$b(k) = \exp\!\left(2\pi i (n-1) k / K\right)\left[\sum_{l=0}^{n-2} M_n(l+1)\exp(2\pi i k l / K)\right]^{-1}$$

$$M_n(u) = \frac{1}{(n-1)!}\sum_{k=0}^{n}(-1)^k \frac{n!}{k!\,(n-k)!}\,\max(u-k,\,0)^{n-1}$$

Recursion relation:

$$M_n(u) = \frac{u}{n-1}\, M_{n-1}(u) + \frac{n-u}{n-1}\, M_{n-1}(u-1)$$
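The recursion transcribes directly into code (plain Python; M2 is the triangular hat function that seeds the recursion):

# Cardinal B-spline M_n(u) via the recursion relation.
def bspline(n, u):
    if n == 2:
        return 1.0 - abs(u - 1.0) if 0.0 <= u <= 2.0 else 0.0
    return (u * bspline(n - 1, u) + (n - u) * bspline(n - 1, u - 1.0)) / (n - 1)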
SPME: Building the Arrays
$$Q(l_1, l_2, l_3) = \sum_{j=1}^{N} q_j \sum_{n_1,n_2,n_3} M_n(u_{1j} - l_1 - n_1 K_1)\, M_n(u_{2j} - l_2 - n_2 K_2)\, M_n(u_{3j} - l_3 - n_3 K_3)$$

is the charge array and $Q^T(k_1,k_2,k_3)$ its discrete Fourier transform. $G^T(k_1,k_2,k_3)$ is the discrete Fourier transform of the function:

$$G(k_1, k_2, k_3) = \frac{\exp\!\left(-k^2/4\alpha^2\right)}{k^2}\, B(k_1, k_2, k_3)\left(Q^{T}(k_1, k_2, k_3)\right)^{*}$$

with $B(k_1, k_2, k_3) = \left|b_1(k_1)\right|^2 \left|b_2(k_2)\right|^2 \left|b_3(k_3)\right|^2$.
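A minimal serial sketch pulling the pieces together (numpy FFTs; cubic cell of side L, a K x K x K grid, order-n splines, reduced units with 4*pi*eps0 = 1; all names are illustrative and this is a sketch of the method, not DL_POLY's implementation):

import numpy as np

# Cardinal B-spline (as in the previous sketch).
def bspline(n, u):
    if n == 2:
        return 1.0 - abs(u - 1.0) if 0.0 <= u <= 2.0 else 0.0
    return (u * bspline(n - 1, u) + (n - u) * bspline(n - 1, u - 1.0)) / (n - 1)

def bmod2(K, n):
    """|b(k)|^2 along one axis, from the B-spline interpolation factors."""
    l = np.arange(K)
    denom = sum(bspline(n, j + 1.0) * np.exp(2j * np.pi * l * j / K)
                for j in range(n - 1))
    return 1.0 / np.abs(denom) ** 2          # the phase factor drops out

def spme_recip(r, q, L, alpha, K, n=4):
    # 1. Spread the point charges onto the K^3 grid with order-n B-splines.
    Q = np.zeros((K, K, K))
    u = (np.asarray(r) / L) * K              # scaled fractional coordinates
    for qj, uj in zip(q, u):
        base = np.floor(uj).astype(int)
        for dx in range(n):
            wx = bspline(n, uj[0] - (base[0] - dx))
            for dy in range(n):
                wy = bspline(n, uj[1] - (base[1] - dy))
                for dz in range(n):
                    wz = bspline(n, uj[2] - (base[2] - dz))
                    Q[(base[0] - dx) % K, (base[1] - dy) % K,
                      (base[2] - dz) % K] += qj * wx * wy * wz
    # 2. One FFT of the whole charge array.
    QT = np.fft.fftn(Q)
    # 3. Combine with the influence function on the k-grid.
    m = np.fft.fftfreq(K, d=1.0 / K)         # signed integer grid indices
    mx, my, mz = np.meshgrid(m, m, m, indexing="ij")
    k2 = (2.0 * np.pi / L) ** 2 * (mx ** 2 + my ** 2 + mz ** 2)
    k2[0, 0, 0] = 1.0                        # placeholder; k = 0 dropped below
    b2 = bmod2(K, n)
    B = b2[:, None, None] * b2[None, :, None] * b2[None, None, :]
    G = np.exp(-k2 / (4.0 * alpha ** 2)) / k2 * B
    G[0, 0, 0] = 0.0                         # drop the k = 0 term
    return (2.0 * np.pi / L ** 3) * np.sum(G * np.abs(QT) ** 2)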
SPME: Comments
SPME is generally faster than the conventional Ewald sum; the algorithm scales as O(N log N)
In DL_POLY_2 the FFT array is built in pieces on each processor and made whole by a global sum for the FFT operation
In DL_POLY_3 the FFT array is built in pieces on each processor and kept that way for the distributed FFT operation (DAFT)
The DAFT FFT hides all the implicit communications
Part 4
DL_POLY on HPCx
HPCx Miscellanea
Register on-line:
Need project code & password from PI
http://www.hpcx.ac.uk/projects/new_users/
Register, then PI sends notification
Get userid & password from HPCx website
Login
ssh -l userid -X login.hpcx.ac.uk
Tools
emacs, vi, LoadLeveler, ...
Compiling DL_POLY on HPCx
Copy Makefile from 'build' to 'source'
Use 'make hpcx' to compile
Executable appears in the 'execute' directory
DLPOLY.X is DL_POLY_2 executable
DLPOLY.Y is DL_POLY_3 executable
Standard executables available in
/usr/local/packages/dlpoly/DL_POLY_2/execute/DLPOLY.X
/usr/local/packages/dlpoly/DL_POLY_3/execute/DLPOLY.Y
Running DL_POLY on HPCx
Script:
#@ shell = /bin/ksh
#
#@ job_type = parallel
#@ job_name = gopoly
#
#@ tasks_per_node = 8
#@ node = 8
#
#@ node_usage = not_shared
#@ network.MPI = csss,shared,US
#
#@ wall_clock_limit = 00:59:00
#@ account_no = xxxx
#
#@ output = $(job_name).$(schedd_host).$(jobid).out
#@ error = $(job_name).$(schedd_host).$(jobid).err
#@ notification = never
#
#@ queue

export MP_SHARED_MEMORY=yes
poe ./DLPOLY.Y
Running DL_POLY on HPCx
Job submission:
llsubmit job_script
Job status:
llq -u user_id
Job cancel:
llcancel job_id
DL_POLY Directories
build - home of makefiles
source - DL_POLY source code
execute - home of executable & working directory
java - Java GUI source code
utility - utility codes
DL_POLY I/O Files
Input: CONTROL, CONFIG, FIELD, TABLE*, REVOLD*
Output: OUTPUT, REVCON, REVIVE, HISTORY*, STATIS*, RDFDAT*, ZDNDAT*
(* = optional)
DL_POLY_3 on HPCx
Test case 1 (552,960 atoms, 300 timesteps)
NaKSi2O5 disilicate glass
SPME (128^3 grid) + 3-body terms, 15,625 link cells
32-512 processors (4-64 nodes)
Test case 2 (792,960 atoms, 10 timesteps)
64 x Gramicidin (354) + 256,768 H2O
SHAKE + SPME (256^3 grid), 14,812 link cells
16-256 processors (2-32 nodes)
DL_POLY_3 on HPCx
[Performance and scaling plots for Test cases 1 and 2 - not reproduced]
Part 5
The DL_POLY Java GUI
GUI Overview
Java - Free!
Facilitate use of code
Selection of options (control of capability)
Construct (model) input files
Control of job submission
Analysis of output
Portable and easily extended by user
GUI Appearance
[Screenshot: menu bar, graphics buttons, graphics window and monitor window]
Invoking the GUI
Invoke the GUI from within the
execute directory (or equivalent):
java -jar ../java/GUI.jar
Colour scheme options:
java -jar ../java/GUI.jar -colourscheme
with colourscheme one of:
monet, vangoch, picasso, cezanne,
mondrian (default picasso).
Compiling/Editing the GUI
Edit source in java directory
Edit using vi, emacs, whatever
Compile in java directory:
javac *.java
jar -cfm GUI.jar manifesto *.class
Executable is GUI.jar
Menus Available
File - Simple file manipulation, exit etc.
FileMaker - make input files:
CONTROL, FIELD, CONFIG, TABLE
Execute
Select/store input files, run job
Analysis
Static, dynamic, statistics, viewing, plotting
Information
Licence, Force Field files, disclaimers, etc.
Using the Menus
[GUI screenshots]
A Typical GUI Panel
[Screenshot: a panel with buttons, a checkbox and text boxes]
Part 6
DL_POLY Hands-On
Hands-On Session
This will consist of three components:
A demonstration of the Java GUI
Trying some DL_POLY simulations:
prepared exercises, or
creative play
DL_POLY clinic - what's up, doc?
Hands-on Session Info
Invoke Netscape
Access:
http://www.cse.clrc.ac.uk/msi/software/DL_POLY/COURSES/Tutorial
Follow instructions therein!