X

Search Results

Searching....

A.2 Calculating Electron Repulsion Integrals (ERIs)

A.2.4 Efficiency of ERI Evaluation

(November 19, 2024)

It is often stated that the cost of SCF calculations scales with size as 𝒪(N4), but the veracity of this assertion depends critically on what is meant by N. For fixed molecule size, the cost may exhibit quartic scaling due to the growth in the number of ERIs, (μν|λσ). However, what is more often intended in a discussion of the asymptotic cost of an electronic structure calculation is the scaling with respect to molecular size, for a fixed choice of basis set. This corresponds to measuring size (N) in terms of the number of atoms, in which case the asymptotic cost should be 𝒪(N2), as discussed below.

A.2.4.1 Quadratic Scaling

The success of quantitative modern quantum chemistry, relative to its primitive, qualitative beginnings, can be traced to two sources: better algorithms and better computers. While the two technologies continue to improve rapidly, efforts are heavily thwarted by the fact that the total number of ERIs increases quadratically with the size of the molecular system. Even large increases in ERI algorithm efficiency yield only moderate increases in applicability, hindering the more widespread application of ab initio methods to areas of, perhaps, biochemical significance where semi-empirical techniques 305 Dewar M. J. S.
Org. Mass. Spect.
(1993), 28, pp. 305.
Link
have already proven so valuable.

Thus, the elimination of quadratic scaling algorithms has been the theme of many research efforts in quantum chemistry throughout the 1990s and has seen the construction of many alternative algorithms to alleviate the problem. Johnson was the first to implement DFT exchange/correlation functionals whose computational cost scaled linearly with system size. This paved the way for the most significant breakthrough in the area with the linear scaling CFMM algorithm 1360 White C. A. et al.
Chem. Phys. Lett.
(1994), 230, pp. 8.
Link
leading to linear scaling DFT calculations. 1362 White C. A. et al.
Chem. Phys. Lett.
(1996), 253, pp. 268.
Link
Further breakthroughs have been made with traditional theory in the form of the QCTC 217 Challacombe M. et al.
Modern developments in hartree-fock theory: fast methods for computing the coulomb matrix
(1996), 1, pp. 53.
Link
, 216 Challacombe M., Schwegler E., Almlöf J.
J. Chem. Phys.
(1996), 104, pp. 4685.
Link
, 218 Challacombe M., Schwegler E.
J. Chem. Phys.
(1997), 106, pp. 5526.
Link
and ONX 1139 Schwegler E., Challacombe M.
J. Chem. Phys.
(1996), 105, pp. 2726.
Link
, 1140 Schwegler E., Challacombe M.
J. Chem. Phys.
(1996), 106, pp. 9708.
Link
algorithms, while more radical approaches 23 Adamson R. D., Dombroski J. P., Gill P. M. W.
Chem. Phys. Lett.
(1996), 254, pp. 329.
Link
, 328 Dombroski J. P., Taylor S. W., Gill P. M. W.
J. Phys. Chem.
(1996), 100, pp. 6272.
Link
may lead to entirely new approaches to ab initio calculations. Investigations into the quadratic Coulomb problem has not only yielded linear scaling algorithms, but is also providing large insights into the significance of many molecular energy components.

Linear scaling Coulomb and SCF exchange/correlation algorithms are not the end of the story as the 𝒪(N3) diagonalization step has been rate limiting in semi-empirical techniques and, been predicted to become rate limiting in ab initio approaches in the medium term. 1229 Strout D. L., Scuseria G. E.
J. Chem. Phys.
(1995), 102, pp. 8448.
Link
However, divide-and-conquer techniques 1410 Yang W.
Phys. Rev. A
(1991), 44, pp. 7823.
Link
, 1411 Yang W.
Phys. Rev. Lett.
(1991), 66, pp. 1438.
Link
, 1409 Yang W., Lee T.-S.
J. Chem. Phys.
(1995), 103, pp. 5674.
Link
, 750 Lee T.-S., York D. M., Yang W.
J. Chem. Phys.
(1996), 105, pp. 2744.
Link
and the recently developed quadratically convergent SCF algorithm 941 Ochsenfeld C., Head-Gordon M.
Chem. Phys. Lett.
(1997), 270, pp. 399.
Link
show great promise for reducing this problem.

A.2.4.2 Algorithm Selection

No single ERI algorithm is available to efficiently handle all integral classes; rather, each tends to have specific integral classes where the specific algorithm outperforms the alternatives. The PRISM algorithm 419 Gill P. M. W., Pople J. A.
Int. J. Quantum Chem.
(1991), 40, pp. 753.
Link
is an intricate collection of pathways and steps in which the path chosen is that which is the most efficient for a given class. It appears that the most appropriate path for a given integral class depends on the relative position of the contraction step (lowly contracted bra-kets prefer late contraction, highly contracted bra-kets are most efficient with early contraction steps).

Careful studies have provided FLOP counts which are the current basis of integral algorithm selection, although care must be taken to ensure that algorithms are not rate limited by MOPs. 380 Frisch M. J. et al.
Chem. Phys. Lett.
(1993), 206, pp. 225.
Link
Future algorithm selection criteria will take greater account of memory, disk, chip architecture, cache size, vectorization and parallelization characteristics of the hardware, many of which are already exist within Q-Chem.

A.2.4.3 More Efficient Hartree–Fock Gradient and Hessian Evaluations

Q-Chem combines the Head-Gordon–Pople (HGP) method 504 Head-Gordon M., Pople J. A.
J. Chem. Phys.
(1988), 89, pp. 5777.
Link
and the COLD prism method 22 Adams T. R., Adamson R. D., Gill P. M. W.
J. Chem. Phys.
(1997), 107, pp. 124.
Link
for Hartree-Fock gradient and Hessian evaluations. All two-electron four-center integrals are classified according to their angular momentum types and degrees of contraction. For each type of integrals, the program chooses one with a lower cost. In practice, the HGP method is chosen for most integral classes in a gradient or Hessian calculation, and thus it dominates the total CPU time.

Recently the HGP codes within Q-Chem were completely rewritten for the evaluation of the P IIx P term in the gradient evaluation, and the P IIxy P term in the Hessian evaluation. Our emphasis is to improve code efficiency by reducing cache misses rather than by reducing FLOP counts. Some timing results from a Hartree-Fock calculation on the azidothymidine (AZT) molecule are shown below.

Basis Set AIX Linux
Gradient Evaluation: P IIx P Term
Old New New/Old Old New New/Old
3-21G 34 s 20 s 0.58 25 s 14 s 0.56
6-31G** 259 s 147 s 0.57 212 s 120 s 0.57
DZ 128 s 118 s 0.92 72 s 62 s 0.86
cc-pVDZ 398 s 274 s 0.69 308 s 185 s 0.60
Hessian Evaluation: P IIxy P term
Old New New/Old Old New New/Old
3-21G 294 s 136 s 0.46 238 s 100 s 0.42
6-31G** 2520 s 976 s 0.39 2065 s 828 s 0.40
DZ 631 s 332 s 0.53 600 s 230 s 0.38
cc-pVDZ 3202 s 1192 s 0.37 2715 s 866 s 0.32
Table A.1: AIX timings were obtained on an IBM RS/6000 workstation with AIX4 operating system, and Linux timings on an Opteron cluster where the Q-Chem executable was compiled with an Intel 32-bit compiler.