ATK Advanced Tutorial
Convergence Tips & Tricks
Contents
INTRODUCTION.............................................................................................................................. 3
MAGNETIC-TUNNEL JUNCTION GEOMETRY ................................................................................. 4
PARAMETER TUNING....................................................................................................................... 6
Basis set...................................................................................................................................... 6
K-point sampling......................................................................................................................... 6
Electron temperature................................................................................................................... 8
Electron density parameters ........................................................................................................ 8
Contour integration parameters .................................................................................................. 9
Exchange–correlation................................................................................................................ 11
Convergence criterion and mixing ............................................................................................ 11
Two-probe algorithm parameters .............................................................................................. 12
Summary .................................................................................................................................. 13
OTHER SETTINGS IN VNL .............................................................................................................. 14
SETTING THE INITIAL SPIN............................................................................................................ 15
RUNNING THE CALCULATION ..................................................................................................... 16
SUMMARY ...................................................................................................................................... 16
APPENDIX: CONTOUR INTEGRATION IN ATK............................................................................. 17
2
Introduction
The default numerical parameters in ATK are chosen such that the vast majority of two-probe systems
converge with very little effort. There are however certain system types that cannot be converged, or
which converge very slowly, unless specific parameters are tuned carefully, and it should be
acknowledged that the influence of these parameters on the convergence rate is not entirely well
documented in the manual.
The system which we shall base this tutorial on is a magnetic tunnel junctions (MTJs) with anti-parallel
polarized electrodes, which is among the most difficult types of systems to converge in ATK. Several
tricks, meaning specific parameter tuning, are needed in order to avoid the many possible pitfalls that
thwart the success of convergence for MTJs.
The techniques described below are however widely applicable, and so you will hopefully find
several tips and tricks that can help in the convergence of your specific system, whatever it might be.
We shall go through each parameter in detail, but first we shall consider the specific geometry of the
spin-polarized system which is our test system. Even the geometric setup of the electrodes contains
elements that influence the convergence rate of the two-probe calculation, thus it is recommended to
read through this part too, even if you are not interested in or even familiar with magnetic tunnel
junctions.
Before we start, we should also point out that our example takes up to 12 hours to compute in serial.
It does, however, parallelize very well, and we thus strongly recommend that you run this calculation
in parallel using MPI. This can reduce the calculation time down to a couple of hours only.
3
Magnetic-tunnel junction geometry
The system we will use as an example is a Fe/MgO/Fe magnetic tunnel junction. There is a detailed
and quite complicated description of how to create the geometry of this system in the ATK manual,
but since the release of Virtual NanoLab 2008.02 there is a much easier way, namely by using the
Magnetic-Tunnel Junction Builder.
Figure 1: The Virtual NanoLab main toolbar.
For our example, we will stick to the default geometric configuration suggested by this instrument.
Therefore, just open it by double-clicking it on the VNL main toolbar and leave all parameters at their
default values. You can always restore the default parameters by clicking the button “Default”.
Figure 2: The default geometry suggested by the Magnetic Tunnel
Junction Builder works fine for our example, but one should always
carefully consider both the number of electrode and surface layers.
Nevertheless, we already here actually encounter one possibility for improving the convergence rate.
By increasing the electrode depth from the default 4 layers to 6 or 8 (it must be even, since it is an fcc
[100] surface), one can expect better stability in the self-consistent loop. The effect is quite subtle, but
4
the consequence is the same as for many of the parameters to be discussed later, namely to improve
the accuracy of the determination of the Fermi level.
The underlying reason why 4 electrode layers may not be sufficient is related to an approximation, or
rather a simplification, in the two-probe algorithm used by ATK. During the electrode calculation, the
system is treated as a normal bulk system, and hence all interactions with neighboring atoms are
properly taken into account. The electrode calculation produces a particular Fermi level, one for each
electrode, used to align the chemical potential against the bias, etc.
During the two-probe calculation, however, it is for simplicity assumed that in the transport direction
there are no interactions that extend beyond the nearest-neighboring electrode cell. If, in reality, there
are such interactions, they are simply truncated from the Hamiltonian. In this case, the Hamiltonian
used in the two-probe calculation will be different from that used in the electrode calculation, and this
effectively amounts to an unnecessary approximation. If there are sharp features in the DOS around
the Fermi level, this difference can cause problems for the convergence, and may compromise the
accuracy of the results as well.
This is illustrated in the figure below. If the interaction range is such that atoms which are 3a apart
have a basis function overlap, the interactions marked by red (specifically, the corresponding matrix
elements in the Hamiltonian) will be included in the electrode calculation but not in the two-probe
calculation. All other interactions, some of which are indicated in blue below, will be included in
both calculations. Of course, the further apart the atoms are, the smaller the matrix element is, and
whether these contributions to the Hamiltonian have a significant influence on the calculation or not
varies from case to case.
interaction range
electrode cell
A second point to note is the number of surface layers. The default value in the MTJ Builder (that is, 4)
is really a minimal value. 6 or 8 would be better, especially for finite bias calculations. This is because
the surface layers must be able to provide sufficient screening of the interaction region for the bulk
electrodes in order for the two-probe method used in ATK to work properly.
As always, there is a downside as well, however, to increasing the number of electrode and surface
layers. Memory usage increases quite steeply with more atoms, especially when we use a lot of k-
points. Therefore, to save time for our example calculation, we stick with the default suggestion
anyway. The points discussed in this section should however always be considered when setting up
the two-probe geometry.
Once the geometry is defined, drag it from the MTJ builder to the NanoLanguage Scripter, using the
icon in the lower right corner.
5
Parameter tuning
In the NanoLanguage Scripter, click the Method tab.
In this example we are setting up the calculation where the electrodes have anti-parallel spin
polarization. If this converges properly, so does almost guaranteed the parallel polarization case, so
one should always start with the anti-parallel case!
Basis set
The first item in the drop-down list of parameter groups is Basis Set. For our test calculation you can
save a lot of time (and memory!) by using a smaller basis set for iron. Actually, results obtained with
SZP for iron are usually indistinguishable from DZP results. Therefore, tick the box “Specify Basis Set
by Element Type”, and locate Fe in the list, then add a tick mark before it, and select
SingleZetaPolarized as “Type”.
Figure 3: Specifying a smaller basis set for Fe in the NanoLanguage
Scripter.
If you decide to stick with DZP to get better accuracy, make sure you have enough memory,
especially if you increased the number of atoms in the system; you should benchmark the calculation
with SZP first anyway (or DZP but with fewer atoms), to determine how much memory is needed.
K-point sampling
Next in the list are the parameters for Brillouin Zone Integration, which are very important for any
system, and not least for FeMgO.
6
If you study the density of states (DOS) for a Fe/MgO interface, you will find a very sharp peak around
the Fermi level for the minority spin component. It is thus clear that a small inaccuracy in the
determination of the Fermi level in the calculation will give rise to a large change in the DOS, which
in turn will give inaccurate results – and quite possible hamper the convergence, especially if the
Fermi level wobbles from iteration to iteration.
Figure 4: Surface DOS for a Fe/MgO interface shows a sharp peak at the
Fermi level for the minority spin.
One of the most important parameters for obtaining an accurate Fermi level is the k-point sampling.
Increasing the number of k-points reduces charge oscillations in the initial 5–10 steps of the self-
consistent cycle, and improves the convergence in general.
We therefore recommend that in a system like this, one should have at least 10x10 k-points in the
transverse direction (called A and B in ATK) to the interfaces. This is especially important since the
unit cell is very small in the transverse plane. If it is increased, the k-point sampling can be reduced
accordingly.
In the transport direction (C), about 100 points is recommended. If you increase the number of
electrode layers to 6 or 8, this value can however be reduced, perhaps to 50 without compromising
the accuracy. This will be important to keep the memory consumption down.
For particular geometries, and especially for finite bias, it may be necessary to increase the k-point
sampling further, perhaps even as high as 18x18. There are some significant consequences of this that
should be considered.
• Naturally the computation time increases with the k-point sampling. Fortunately, ATK is
parallelized over the k-points so this can to a rather large degree be balanced by using more
parallel nodes in the calculation. Good scaling should be obtained for this system due to the
large number of k-points (and contour integration points; see below).
• Memory consumption also increases with more k-points. This increase can be quite steep, and
one should be careful not to start with too many points in the initial tests, but increase as
needed if convergence is not obtained, or to check the accuracy of the results. The electrode
7
calculation is the most memory-hungry part of this particular calculation; with 10x10x100 k-
points, and all other parameters chosen as described in this tutorial, the peak RAM usage is
less than 800 Mb. If one increases the number of electrode layers to 8, and the basis set to
DZP, and the k-point sampling to 18x18x100, say, one ends up at about 3 Gb of RAM instead.
• The number of k-points in the C direction only affects the electrode calculation, while the k-
points in the transverse plane are also used for the calculation of the surface Green’s function
in the two-probe part. Therefore, the A and B k-points have a stronger influence on the total
calculation time, but on the other hand this is precisely where ATK is parallelized.
The example system we are looking at there converges well with 10x10x100 k-points.
Electron temperature
Under the next parameter group, Eigenstate Occupation, we find the specification of the electron
temperature. This determines the broadening of the Fermi function in the electrodes. It is quite natural
to expect that by increasing the temperature, we reduce the effect of the sharp peak in the Fermi level
DOS.
It has been found that setting the electron temperature to somewhere between 1000 and 1300 K is
very beneficial for the convergence of the FeMgO system.
On the other hand, increasing the temperature this way changes the results a bit. A good strategy is
therefore to converge the system at a higher temperature and then use the converged system as a
starting guess for a new calculation at room temperature. We will however not cover this step in this
tutorial.
Let us use 1200 K in our example.
Electron density parameters
Skipping the electrode voltages (one should always converge the zero bias calculation first) we next
encounter the Electron Density parameters.
Testing shows that the default mesh cut-off is sufficiently accurate, although in order to obtain good
lattice parameters for Fe, one generally needs to increase the mesh cut-off (i.e. make the real-space
mesh for the Poisson equation finer) quite substantially. Energies, as we are looking for here, are
however less sensitive to the mesh than are forces and stress.
Therefore we stick to the default 150 Rydberg mesh cut-off.
The NanoLanguage scripter has no ability to set the initial spin individually for each atom. We can
however easily edit the script later. For now, set the radio button for spin to “Initial Scaled Spin”, and
tick “Heterogeneous”. Then enter 0.56 for the left initial spin and –0.56 for the right.
8
Figure 5: We temporarily set zero spin in the central region; we will
later edit (by hand) the spin for each individual atom in the central
region to match the electrodes.
The “Initial Scaled Spin” parameter sets the initial spin on each atom to a fraction of its atomic spin.
Usually the spin on an atom is smaller in the solid, compared to the isolated atom, thus, this
parameter should usually be less than 1 (it is always constrained to the interval [−1,1]). The value 0.56
is obtained by doing a bulk calculation for Fe. For further information, see the ATK manual section for
electronDensityParameters (from the main manual page, enter “KohnSham module” under the
section “Reference manual” and then click the relevant link).
Contour integration parameters
In order to treat electron transport under finite bias, ATK employs a method of contour integration in
the complex energy plane to avoid the poles on the real energy axis when computing the density
matrix from the Green’s function.
For most cases the default Contour Integration parameters work fine, but there are exceptions and the
present system is one. In those situations, it is necessary to understand a little bit about the contour
integration, although in reality it is sufficient to know how to tune the parameters, rather than grasp
the entire underlying mathematics. We shall anyway use the opportunity here to present some details
about the contour integration in ATK, at least as a technical reference, but is included as an appendix
in order not to break the presentation.
Based on that discussion, the parameters that may require tuning are:
• Integral Lower Bound: During the iterative cycle, individual poles of the Green’s function
may appear at very low energies, outside the contour. When this happens, charge will
disappear from the system which typically initiates an avalanche effect, pushing the poles even
further out, and even more charge is lost, until the system is totally void of electrons.
9
When this occurs, the calculation will actually converge trivially, and often in very few steps
(5–10). However, the result is obviously wrong, as indicated by the fact that the total number
of electrons in the central region, as reported by the value “q” during each step, becomes zero.
To avoid this problem one should increase the extent of the contour, viz. it’s lower bound on
the real axis. The default value of “Integral Lower Bound” is 3 Rydberg, but for FeMgO and
other difficult systems, a value between 5 and 10 Rydberg is recommended.
• Circle Points: As the size of the contour is increased, it is a good idea to correspondingly
increase the number of points on the contour to avoid losing in accuracy. Suggested values are
between 50 and 70 points (default 30).
• Real Axis Point Density: For finite bias calculations, our experience shows that it is beneficial
to decrease the “Real Axis Point Density” to say 0.005 eV (default 0.02 eV), as this determines
the number of integration points in the bias window. This parameter has no influence on the
zero-bias calculation, however.
• Real Axis Infinitesimal. It can sometimes be advantageous to increase this parameter, but one
should be aware that this also changes the physical results. The proper approach in this case is
to “anneal” the system, and converge it using the larger value, and then use this calculation as
an initial guess for a new calculation with the default (or even smaller) value.
Increasing the number of points on the contour has a direct effect on the calculation time. Again,
however, this is precisely where ATK is parallelized, so using more parallel nodes counters the effect
and the calculations should scale quite well to as many nodes as circle points, especially if the
number of transverse k-points is comparable to the number of circle points.
Figure 6: The contour integration parameters chosen for our example.
10
Exchange–correlation
For our example we will use the default Exchange–Correlation Functional ([Link]). This reduces the
memory consumption and the calculation time, although for real work one should at least double-
check the influence of switching to GGA on the physical results (the current, for instance).
Convergence criterion and mixing
Good convergence is ultimately a matter of a good mixing strategy. The default mixing quantity in
ATK is Hamiltonian mixing, and it is strongly recommended always to use this for two-probe systems.
To reduce the risk of charge oscillations, it is however a good idea to reduce the “Diagonal Mixing
Parameter” to 0.05 under Iteration Mixing parameters. Also, it often helps convergence to increase
the number of history steps, although the default works fine in this example.
Figure 7: Convergence and mixing parameters.
Regarding Iteration Control, the new convergence criterion “TotalEnergy” (default) introduced in ATK
2008.02 has meant that many systems converge up to 2 times faster than before. Or, rather, the self-
consistent iteration loop is terminated in half as many steps. With the old criterion (“Strict”, which is
still available), the calculation would run for a very long time to converge the very small off-diagonal
parts of the density matrix, even if these have a negligible effect on the physical observables such as
total energy and – ultimately – the transmission and hence current.
11
Two-probe algorithm parameters
The last parameter group is called TwoProbe Algorithm.
We recommend to always use the default electrode constraint (“Off”), since it gives more proper
physical behavior (specifically, symmetry) of the effective potential. In some cases, however, if
convergence is really difficult to obtain, one can attempt to first run (either to full convergence or for a
number of steps) with the constraint “RealSpaceDensity”, and then use the so-obtained density matrix
in the NetCDF file as a starting guess for a new calculation with the constraint “Off”.
The final parameter to tune is the “Initial Density Type”. For homogeneous two-probe systems where
the left and right electrodes are the same, the “EquivalentBulk” option is typically superior. However,
for heterogeneous setups, either geometrically (when e.g. the left and right electrodes are physically
different) or parameter-wise (as in our case, at least in the anti-parallel spin configuration), the
“NeutralAtom” option is really the only way to go.
Figure 8: For heterogeneous two-probe systems one should always
choose the NeutralAtom initial density type.
This means, that the density matrix starts out as a diagonal matrix with all occupied valence orbitals
having unit weight. This is usually not a very good guess for atoms in a crystal lattice or a molecule,
so a two-probe calculation, which is very sensitive to details, can fail to converge even just because of
this. This is why the “EquivalentBulk” is the better choice generally, since it first converges the two-
probe system as a bulk system, and then uses the so-obtained density matrix as the initial guess for the
two-probe calculation. However, if the two-probe geometry is not periodic along the transport
direction, this approach will almost certainly fail.
Thus, we keep the default “Off” constraint but switch to “NeutralAtom” initial density.
12
Summary
To summarize, the following parameters are modified compared to default:
• Basis Set
o Type (for Fe) = SingleZetaPolarized
• Brillouin Zone Integration
o Number of k-points (A/B/C) = 10/10/100
• Eigenstate Occupation
o Electron Temperature = 1200 Kelvin
• Electron Density
o Heterogeneous (check)
o Initial Scaled Spin = 0.56 (left) and –0.56 (right)
• Energy Contour
o Circle Points = 50
o Integral Lower Bound = 7 Rydberg
o Real Axis Points Density = 0.005 eV (for finite bias)
• Iteration Mixing
o Diagonal mixing parameter = 0.05
• TwoProbe Algorithm
o Initial Density = NeutralAtom
In addition, choosing a sufficient number of electrode layers, as well as surface layers, is very
important.
13
Other settings in VNL
The last – and crucially important! – setting we need to attend to is the checkpoint file (on the Self-
Consistent Calculation tab). Make sure the check “Specify the checkpoint file”, and give it an
appropriate name.
Figure 9: Choose either 1 or 10 for verbosity.
It is often more convenient to inspect the progress of the convergence when the verbosity level is set
to 1. On the other hand, using 10 prints the Mulliken populations for each step, which can help
diagnose convergence problems. Choose whichever value you prefer.
We will not perform any analysis. Once the two calculations, the parallel and the anti-parallel, have
converged, we can always restore the checkpoint files and calculate the quantities of interest. This,
however, is a separate tutorial session!
14
Setting the initial spin
Before we can run the calculation we need to set the initial spin for each atom in the central region.
To do this, use the icon in the bottom right corner to drag the NanoLanguage script to the Script
Editor on the VNL main toolbar, or drag it to your favorite external editor. You can also save the
script first (by using the “Save As” button) and open it in an external editor.
Locate the line that specifies the initial_scaled_spin on the electronDensityParameters for the central
region (see figure below).
Figure 10: Using the Script Editor in VNL to set the initial scaled spin.
There are 16 entries in the list, corresponding to the 16 atoms in the central region; 4 Fe layers on the
left surface, 4 MgO layers in the middle (8 atoms) and 4 Fe surface layers to the right. (If you increased
the number of surface atoms, you must adjust the instructions here to match your choice.) To match
the electrode polarizations, the Fe surface atoms should be set to the same initial spin as their
corresponding electrode atoms (left/right). Therefore, modify the code to read
electron_density_parameters = electronDensityParameters(
mesh_cutoff = 150.0*Rydberg,
initial_scaled_spin = [ 0.56, 0.56, 0.56, 0.56,
0., 0., 0., 0., 0., 0., 0., 0.,
-0.56, -0.56, -0.56, -0.56 ] )
)
Save the script, calling it for instance “FeMgOFe_antipara.py”.
While we are at it, we may as well create the script for the parallel calculation too. To do this, only
the following parts of the script need to be modified:
• The initial scaled spin
o for the central region (which we just edited above); instead of –0.56 it should be 0.56
also for the right surface Fe atoms.
o for the right electrode; instead of –0.56 it should be 0.56 for all atoms.
• The checkpoint_filename (at the very bottom of the script)
15
• In case you included any analysis features, also make sure to use individual filenames for the
VNL file.
Save the script, calling it for instance “FeMgOFe_para.py”. Remember to click “Save As”, not “Save”,
if you are using the Script Editor, otherwise you overwrite the anti-parallel script!
Running the calculation
As mentioned above, this calculation should really by run in parallel. The system shows nearly linear
speedup with the number of processors and can thus be many times faster, depending on the number
of processors. However, it is also possible to run it in serial; it takes around 12 hours on a “typical”
dual-core laptop running Windows, and does not use more than 1 Gb of RAM.
Convergence should be obtained in 71 iterations if you have used exactly the same parameters as
above. It is not impossible that further fine-tuning of the parameters can reduce this number, so feel
free to experiment, and do let us know if you manage to bring the number of iterations!
Summary
This concludes this tutorial, which focused on identifying and tuning some of the parameters that are
important to control in order to converge “difficult” systems. We hope that our users are able to put
this knowledge to good use, both for MTJ systems and other geometries, and that it can help them be
more productive and spend less time struggling with obtaining convergence and more time producing
data, and analyzing and publishing their results!
If you have any questions of comments on this tutorial, or ATK and VNL in general, please do not
hesitate to contact us!
16
Appendix: Contour integration in ATK
One of the most fundamental steps in the calculation of a two-probe system is the determination of
the density matrix D. In ATK, this is done by integrating the Green’s function G (which is a matrix,
once it is expanded in the numerical orbital basis set) along the real energy axis,
1 ⎡ ⎤
∞
D = − Im⎢ ∫ dε G (ε + iδ) nF (ε − μ)⎥ ,
π ⎣− ∞ ⎦
where δ is a small real infinitesimal,
(
nF ( ε ) = e ε / k B T + 1) −1
is the Fermi function, μ is the chemical potential, and T is the electron temperature.
To carry out this integral numerically requires a very large amount of integration points since the
Green’s function has poles on the real axis. Since it is time-consuming to evaluate the Green’s
function for each energy ε, this becomes very inefficient, and there is a risk of poor accuracy.
Instead one can perform the integration in the complex energy plane and use Cauchy’s residual
theorem. The closed contour integral over
• L = line from ε=∞+iΔ to ε=μ−γ+iΔ
• C = arc from ε=μ−γ+iΔ to ε=μ−λ
• M =line from ε=μ−λ to ε=∞
• N = line from ε=∞ to ε=∞+iΔ (however, the integrand is zero on this path)
encloses the poles of the Fermi function at ε=zν=(2ν+1)πkBT for integer ν.
μ−λ μ−γ μ M
Thus,
∫ dz G ( z ) n F ( z − μ) = −2πik BT ∑ G ( zν ) ,
ν
where we have used that the residues of nF are −kBT. We thus obtain
∞
∫ dε G (ε + iδ) n F (ε − μ ) = ∫ dz G ( z) n F ( z ) − 2πik BT ∑ G ( zν ) .
μ −λ C+L ν
The right-hand side can be evaluated numerically by choosing a particular number of Fermi poles zν
to include (indicated by the filled red circles in the figure above; the empty circles are Fermi poles
17
outside the closed contour), in order that the contour is sufficiently far away from the real axis. This
ensures that the Green’s function is smooth on C and L, and we can perform the integration by using a
quadrature with a small number of points. The main variation of the integrand on L comes from the
Fermi function, and therefore it is advantageous to use nF as a weight function in the quadrature.
It is important to point out that the integral on the left-hand side of the equation above is not exactly
the original one, which was running from −∞, while our numerical integral for practical reasons has a
finite end-point determined by λ. As long as this point is below the bottom valence band edge, there
are no further poles in the integrand, and thus the expression is exact. If, on the other hand, the value
of λ is too small, and there are poles outside the contour, the corresponding poles will not be
included in the integration. This causes charge to erroneously disappear from the system, which
typically initiates an avalanche effect, pushing the poles even further out, and even more charge is lost,
until the system is totally void of electrons.
When this occurs, the calculation will actually converge trivially, but to a physically incorrect solution.
Careful tuning of λ is therefore paramount. The default value is typically good enough, but in some
cases it has to be increased to avoid this effect.
The geometry of the contour can be controlled by the user through a number of input parameters to
ATK:
• circle_points, the number of points used in the quadrature over C.
• integral_lower_bound, corresponds to λ.
• fermi_line_points, the number of points in the quadrature over L. Do not change this value
from the default 10!
• fermi_function_poles, how many poles zν to include. This indirectly determines Δ, and the
default value (4) should be fine in virtually all cases.
• real_axis_infinitesimal, corresponds to δ.
• real_axis_point_density, the spacing between the energy points in the integration over the
bias window around μ close to the real axis. This parameter is not used in the zero-bias
calculation.
• The parameter γ can not be adjusted by the user, but the internal pre-defined value should be
fine in all situations.
For a discussion of the implications of these parameters on convergence, see the main text.
The picture above applies only to the zero-bias case. For more information, especially on the non-
equilibrium situation, see M. Brandbyge et al., PRB 65, 165401 (2002).
18