A.2 Calculating Electron Repulsion Integrals (ERIs)A.2.3 Survey of ERI Evaluation A.3 User-Controllable Variables

A.2.4 Efficiency of ERI Evaluation

(November 19, 2024)

It is often stated that the cost of SCF calculations scales with size as ${\cal{O}}({N^{4}})$ , but the veracity of this assertion depends critically on what is meant by $N$ . For fixed molecule size, the cost may exhibit quartic scaling due to the growth in the number of ERIs, $(\mu\nu|\lambda\sigma)$ . However, what is more often intended in a discussion of the asymptotic cost of an electronic structure calculation is the scaling with respect to molecular size, for a fixed choice of basis set. This corresponds to measuring size ( $N$ ) in terms of the number of atoms, in which case the asymptotic cost should be ${\cal{O}}({N^{2}})$ , as discussed below.

A.2.4.1 Quadratic Scaling

The success of quantitative modern quantum chemistry, relative to its primitive, qualitative beginnings, can be traced to two sources: better algorithms and better computers. While the two technologies continue to improve rapidly, efforts are heavily thwarted by the fact that the total number of ERIs increases quadratically with the size of the molecular system. Even large increases in ERI algorithm efficiency yield only moderate increases in applicability, hindering the more widespread application of ab initio methods to areas of, perhaps, biochemical significance where semi-empirical techniques ³⁰⁵ Dewar M. J. S.
Org. Mass. Spect.
(1993), 28, pp. 305. Link have already proven so valuable.

Thus, the elimination of quadratic scaling algorithms has been the theme of many research efforts in quantum chemistry throughout the 1990s and has seen the construction of many alternative algorithms to alleviate the problem. Johnson was the first to implement DFT exchange/correlation functionals whose computational cost scaled linearly with system size. This paved the way for the most significant breakthrough in the area with the linear scaling CFMM algorithm ¹³⁶⁰ White C. A. et al.
Chem. Phys. Lett.
(1994), 230, pp. 8. Link leading to linear scaling DFT calculations. ¹³⁶² White C. A. et al.
Chem. Phys. Lett.
(1996), 253, pp. 268. Link Further breakthroughs have been made with traditional theory in the form of the QCTC ²¹⁷ Challacombe M. et al.
Modern developments in hartree-fock theory: fast methods for computing the coulomb matrix
(1996), 1, pp. 53. Link ^, ²¹⁶ Challacombe M., Schwegler E., Almlöf J.
J. Chem. Phys.
(1996), 104, pp. 4685. Link ^, ²¹⁸ Challacombe M., Schwegler E.
J. Chem. Phys.
(1997), 106, pp. 5526. Link and ONX ¹¹³⁹ Schwegler E., Challacombe M.
J. Chem. Phys.
(1996), 105, pp. 2726. Link ^, ¹¹⁴⁰ Schwegler E., Challacombe M.
J. Chem. Phys.
(1996), 106, pp. 9708. Link algorithms, while more radical approaches ²³ Adamson R. D., Dombroski J. P., Gill P. M. W.
Chem. Phys. Lett.
(1996), 254, pp. 329. Link ^, ³²⁸ Dombroski J. P., Taylor S. W., Gill P. M. W.
J. Phys. Chem.
(1996), 100, pp. 6272. Link may lead to entirely new approaches to ab initio calculations. Investigations into the quadratic Coulomb problem has not only yielded linear scaling algorithms, but is also providing large insights into the significance of many molecular energy components.

Linear scaling Coulomb and SCF exchange/correlation algorithms are not the end of the story as the ${\cal{O}}({N^{3}})$ diagonalization step has been rate limiting in semi-empirical techniques and, been predicted to become rate limiting in ab initio approaches in the medium term. ¹²²⁹ Strout D. L., Scuseria G. E.
J. Chem. Phys.
(1995), 102, pp. 8448. Link However, divide-and-conquer techniques ¹⁴¹⁰ Yang W.
Phys. Rev. A
(1991), 44, pp. 7823. Link ^, ¹⁴¹¹ Yang W.
Phys. Rev. Lett.
(1991), 66, pp. 1438. Link ^, ¹⁴⁰⁹ Yang W., Lee T.-S.
J. Chem. Phys.
(1995), 103, pp. 5674. Link ^, ⁷⁵⁰ Lee T.-S., York D. M., Yang W.
J. Chem. Phys.
(1996), 105, pp. 2744. Link and the recently developed quadratically convergent SCF algorithm ⁹⁴¹ Ochsenfeld C., Head-Gordon M.
Chem. Phys. Lett.
(1997), 270, pp. 399. Link show great promise for reducing this problem.

A.2.4.2 Algorithm Selection

No single ERI algorithm is available to efficiently handle all integral classes; rather, each tends to have specific integral classes where the specific algorithm outperforms the alternatives. The PRISM algorithm ⁴¹⁹ Gill P. M. W., Pople J. A.
Int. J. Quantum Chem.
(1991), 40, pp. 753. Link is an intricate collection of pathways and steps in which the path chosen is that which is the most efficient for a given class. It appears that the most appropriate path for a given integral class depends on the relative position of the contraction step (lowly contracted bra-kets prefer late contraction, highly contracted bra-kets are most efficient with early contraction steps).

Careful studies have provided FLOP counts which are the current basis of integral algorithm selection, although care must be taken to ensure that algorithms are not rate limited by MOPs. ³⁸⁰ Frisch M. J. et al.
Chem. Phys. Lett.
(1993), 206, pp. 225. Link Future algorithm selection criteria will take greater account of memory, disk, chip architecture, cache size, vectorization and parallelization characteristics of the hardware, many of which are already exist within Q-Chem.

A.2.4.3 More Efficient Hartree–Fock Gradient and Hessian Evaluations

Q-Chem combines the Head-Gordon–Pople (HGP) method ⁵⁰⁴ Head-Gordon M., Pople J. A.
J. Chem. Phys.
(1988), 89, pp. 5777. Link and the COLD prism method ²² Adams T. R., Adamson R. D., Gill P. M. W.
J. Chem. Phys.
(1997), 107, pp. 124. Link for Hartree-Fock gradient and Hessian evaluations. All two-electron four-center integrals are classified according to their angular momentum types and degrees of contraction. For each type of integrals, the program chooses one with a lower cost. In practice, the HGP method is chosen for most integral classes in a gradient or Hessian calculation, and thus it dominates the total CPU time.

Recently the HGP codes within Q-Chem were completely rewritten for the evaluation of the P II ${}^{x}$ P term in the gradient evaluation, and the P II ${}^{xy}$ P term in the Hessian evaluation. Our emphasis is to improve code efficiency by reducing cache misses rather than by reducing FLOP counts. Some timing results from a Hartree-Fock calculation on the azidothymidine (AZT) molecule are shown below.

Basis Set	AIX			Linux
	Gradient Evaluation: P II ${}^{x}$ P Term
	Old	New	New/Old	Old	New	New/Old
3-21G	34 s	20 s	0.58	25 s	14 s	0.56
6-31G**	259 s	147 s	0.57	212 s	120 s	0.57
DZ	128 s	118 s	0.92	72 s	62 s	0.86
cc-pVDZ	398 s	274 s	0.69	308 s	185 s	0.60
	Hessian Evaluation: P II ${}^{xy}$ P term
	Old	New	New/Old	Old	New	New/Old
3-21G	294 s	136 s	0.46	238 s	100 s	0.42
6-31G**	2520 s	976 s	0.39	2065 s	828 s	0.40
DZ	631 s	332 s	0.53	600 s	230 s	0.38
cc-pVDZ	3202 s	1192 s	0.37	2715 s	866 s	0.32

Table A.1: AIX timings were obtained on an IBM RS/6000 workstation with AIX4 operating system, and Linux timings on an Opteron cluster where the Q-Chem executable was compiled with an Intel 32-bit compiler.

Search Results

A.2.4 Efficiency of ERI Evaluation

A.2.4.1 Quadratic Scaling

A.2.4.2 Algorithm Selection

A.2.4.3 More Efficient Hartree–Fock Gradient and Hessian Evaluations