It is often stated that the cost of SCF calculations scales with size as , but the veracity of this assertion depends critically on what is meant by . For fixed molecule size, the cost may exhibit quartic scaling due to the growth in the number of ERIs, . However, what is more often intended in a discussion of the asymptotic cost of an electronic structure calculation is the scaling with respect to molecular size, for a fixed choice of basis set. This corresponds to measuring size () in terms of the number of atoms, in which case the asymptotic cost should be , as discussed below.
The success of quantitative modern quantum chemistry, relative to its
primitive, qualitative beginnings, can be traced to two sources: better
algorithms and better computers. While the two technologies continue to improve
rapidly, efforts are heavily thwarted by the fact that the total number of ERIs
increases quadratically with the size of the molecular system. Even large
increases in ERI algorithm efficiency yield only moderate increases in
applicability, hindering the more widespread application of ab initio
methods to areas of, perhaps, biochemical significance where semi-empirical
techniques
305
Org. Mass. Spect.
(1993),
28,
pp. 305.
Link
have already proven so valuable.
Thus, the elimination of quadratic scaling algorithms has been the theme of
many research efforts in quantum chemistry throughout the 1990s and has seen
the construction of many alternative algorithms to alleviate the problem.
Johnson was the first to implement DFT exchange/correlation functionals whose
computational cost scaled linearly with system size.
This paved the way for the most significant breakthrough in the area with the linear
scaling CFMM algorithm
1360
Chem. Phys. Lett.
(1994),
230,
pp. 8.
Link
leading to linear scaling DFT
calculations.
1362
Chem. Phys. Lett.
(1996),
253,
pp. 268.
Link
Further breakthroughs have been made with
traditional theory in the form of the
QCTC
217
Modern developments in hartree-fock theory: fast methods for computing the coulomb matrix
(1996),
1,
pp. 53.
Link
,
216
J. Chem. Phys.
(1996),
104,
pp. 4685.
Link
,
218
J. Chem. Phys.
(1997),
106,
pp. 5526.
Link
and
ONX
1139
J. Chem. Phys.
(1996),
105,
pp. 2726.
Link
,
1140
J. Chem. Phys.
(1996),
106,
pp. 9708.
Link
algorithms, while more radical
approaches
23
Chem. Phys. Lett.
(1996),
254,
pp. 329.
Link
,
328
J. Phys. Chem.
(1996),
100,
pp. 6272.
Link
may lead to entirely new approaches
to ab initio
calculations. Investigations into the quadratic Coulomb problem has not only
yielded linear scaling algorithms, but is also providing large insights into
the significance of many molecular energy components.
Linear scaling Coulomb and SCF exchange/correlation algorithms are not the end
of the story as the diagonalization step has been rate limiting
in semi-empirical techniques and, been predicted to become
rate limiting in ab initio approaches in the medium term.
1229
J. Chem. Phys.
(1995),
102,
pp. 8448.
Link
However, divide-and-conquer techniques
1410
Phys. Rev. A
(1991),
44,
pp. 7823.
Link
,
1411
Phys. Rev. Lett.
(1991),
66,
pp. 1438.
Link
,
1409
J. Chem. Phys.
(1995),
103,
pp. 5674.
Link
,
750
J. Chem. Phys.
(1996),
105,
pp. 2744.
Link
and the recently developed quadratically convergent SCF algorithm
941
Chem. Phys. Lett.
(1997),
270,
pp. 399.
Link
show great promise for reducing this problem.
No single ERI algorithm is available to efficiently handle all integral
classes; rather, each tends to have specific integral classes where the
specific algorithm outperforms the alternatives. The PRISM algorithm
419
Int. J. Quantum Chem.
(1991),
40,
pp. 753.
Link
is an intricate collection of pathways and steps in which the
path chosen is that which is the most efficient for a given class. It appears
that the most appropriate path for a given integral class depends on the
relative position of the contraction step (lowly contracted bra-kets prefer
late contraction, highly contracted bra-kets are most efficient with early
contraction steps).
Careful studies have provided FLOP counts which are the current basis of
integral algorithm selection, although care must be taken to ensure that
algorithms are not rate limited by MOPs.
380
Chem. Phys. Lett.
(1993),
206,
pp. 225.
Link
Future algorithm
selection criteria will take greater account of memory, disk, chip
architecture, cache size, vectorization and parallelization characteristics of
the hardware, many of which are already exist within Q-Chem.
Q-Chem combines the Head-Gordon–Pople (HGP) method
504
J. Chem. Phys.
(1988),
89,
pp. 5777.
Link
and the
COLD prism method
22
J. Chem. Phys.
(1997),
107,
pp. 124.
Link
for Hartree-Fock gradient and Hessian
evaluations. All two-electron four-center integrals are classified according
to their angular momentum types and degrees of contraction. For each type of
integrals, the program chooses one with a lower cost. In practice, the HGP
method is chosen for most integral classes in a gradient or Hessian
calculation, and thus it dominates the total CPU time.
Recently the HGP codes within Q-Chem were completely rewritten for the evaluation of the P II P term in the gradient evaluation, and the P II P term in the Hessian evaluation. Our emphasis is to improve code efficiency by reducing cache misses rather than by reducing FLOP counts. Some timing results from a Hartree-Fock calculation on the azidothymidine (AZT) molecule are shown below.
Basis Set | AIX | Linux | ||||
---|---|---|---|---|---|---|
Gradient Evaluation: P II P Term | ||||||
Old | New | New/Old | Old | New | New/Old | |
3-21G | 34 s | 20 s | 0.58 | 25 s | 14 s | 0.56 |
6-31G** | 259 s | 147 s | 0.57 | 212 s | 120 s | 0.57 |
DZ | 128 s | 118 s | 0.92 | 72 s | 62 s | 0.86 |
cc-pVDZ | 398 s | 274 s | 0.69 | 308 s | 185 s | 0.60 |
Hessian Evaluation: P II P term | ||||||
Old | New | New/Old | Old | New | New/Old | |
3-21G | 294 s | 136 s | 0.46 | 238 s | 100 s | 0.42 |
6-31G** | 2520 s | 976 s | 0.39 | 2065 s | 828 s | 0.40 |
DZ | 631 s | 332 s | 0.53 | 600 s | 230 s | 0.38 |
cc-pVDZ | 3202 s | 1192 s | 0.37 | 2715 s | 866 s | 0.32 |