Q-Chem can accelerate RI-MP2 calculations using graphics processing units (GPUs). Currently, this is implemented for CUDA-enabled NVIDIA graphics cards only, such as (in historical order from 2008) the GeForce, Quadro, Tesla and Fermi cards. More information about CUDA-enabled cards is available at
http://www.nvidia.com/object/cuda_gpus.html
It should be noted that these GPUs have specific power and motherboard requirements.
Software requirements include the installation of the appropriate NVIDIA CUDA driver (at least version 1.0, currently 3.2) and the CUBLAS linear algebra library (at least version 1.0, currently 2.0). These can be downloaded together from NVIDIA’s developer website.
We have implemented a mixed-precision algorithm that gives better than single-precision accuracy for users who only have single-precision GPUs. It relies on the observation that RI-MP2 matrices have a large fraction of numerically “small” elements and a small fraction of numerically “large” ones. The large elements can greatly affect the accuracy of a calculation done entirely in single precision, but processing them involves a relatively small number of compute cycles. So, given a threshold value, we separate the “small” and “large” elements, accelerate the compute-intensive operations on the small elements using the GPU (in single precision), and compute the contributions of the large elements on the CPU (in double precision). Tuning this threshold determines how much of the work is done in double precision, and therefore sets the balance between computational speed and accuracy.
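The splitting described above can be sketched in a few lines of plain Python. This is a minimal illustration, not Q-Chem's implementation: the function names (`to_f32`, `split`, `matmul`, `mgemm`) are hypothetical, the "GPU" is emulated by rounding every operation to single precision, and the particular decomposition of the product into a double-precision part and a single-precision part is one possible choice that is consistent with the description.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split(M, thresh):
    """Split M element-wise into (large, small) so that M = large + small."""
    large = [[x if abs(x) > thresh else 0.0 for x in row] for row in M]
    small = [[x if abs(x) <= thresh else 0.0 for x in row] for row in M]
    return large, small


def matmul(A, B, single=False):
    """Naive matrix product; with single=True every step is rounded to float32."""
    rnd = to_f32 if single else (lambda x: x)
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for p in range(inner):
                acc = rnd(acc + rnd(rnd(A[i][p]) * rnd(B[p][j])))
            C[i][j] = acc
    return C


def add(X, Y):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mgemm(A, B, thresh):
    """Mixed-precision product: A*B = A*B_large + A_large*B_small + A_small*B_small."""
    A_large, A_small = split(A, thresh)
    B_large, B_small = split(B, thresh)
    # Double precision (stands in for the CPU): every term with a "large" factor.
    cpu_part = add(matmul(A, B_large), matmul(A_large, B_small))
    # Single precision (stands in for the GPU): the small-by-small term.
    gpu_part = matmul(A_small, B_small, single=True)
    return add(cpu_part, gpu_part)
```

In this sketch a threshold of zero sends every nonzero element to the double-precision path, while a very large threshold sends everything to the single-precision path, which mirrors the speed versus accuracy trade-off described in the text.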
CUDA_RI-MP2
Enables GPU implementation of RI-MP2
TYPE:
LOGICAL
DEFAULT:
FALSE
OPTIONS:
FALSE
GPU-enabled MGEMM off
TRUE
GPU-enabled MGEMM on
RECOMMENDATION:
Must be set to TRUE (1) in order to run GPU-enabled RI-MP2.
USECUBLAS_THRESH
Sets the threshold on the size of matrices sent to the GPU
(smaller matrices are not worth sending to the GPU).
TYPE:
INTEGER
DEFAULT:
250
OPTIONS:
n
user-defined threshold
RECOMMENDATION:
Use the default value. Anything smaller can
seriously hinder the GPU acceleration.
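The role of such a size threshold can be illustrated with a small dispatch helper. This is only a sketch under stated assumptions: `choose_backend` is a hypothetical name, and treating "matrix size" as the smallest of the three dimensions of the product is an assumption made for illustration, not a statement of how Q-Chem measures it.

```python
def choose_backend(m, n, k, thresh=250):
    """Decide where an (m x k) by (k x n) matrix product should run.

    Assumption (not from the manual): "matrix size" is taken to be the
    smallest of the three dimensions. Products below the threshold stay on
    the CPU because the transfer overhead would outweigh the GPU speed-up.
    """
    return "gpu" if min(m, n, k) >= thresh else "cpu"
```

For example, `choose_backend(100, 100, 100)` keeps the product on the CPU with the default threshold of 250, while `choose_backend(250, 300, 400)` sends it to the GPU.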
USE_MGEMM
Use the mixed-precision matrix scheme (MGEMM) if you want to run calculations
on your card in single precision (or if you have a single-precision-only GPU)
while leaving some parts of the RI-MP2 calculation in double precision.
TYPE:
LOGICAL
DEFAULT:
FALSE
OPTIONS:
FALSE
MGEMM disabled
TRUE
MGEMM enabled
RECOMMENDATION:
Use with single-precision cards.
MGEMM_THRESH
Sets MGEMM threshold to determine the separation
between “large” and “small” matrix elements.
A larger threshold value will result in a value closer
to the single-precision result. Note that the desired factor
should be multiplied by 10000 to ensure an integer value.
TYPE:
INTEGER
DEFAULT:
10000
(corresponds to 1)
OPTIONS:
User-specified threshold
RECOMMENDATION:
For small molecules and basis sets up to triple-ζ, the
default value suffices to stay close to the
double-precision values. This number should be reduced
for larger molecules and for larger basis sets.
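Because MGEMM_THRESH must be an integer, the desired threshold factor has to be scaled by 10000 before it is written to the input file. The helper below is a minimal sketch of that conversion; the name `mgemm_thresh_rem` is hypothetical and is not part of Q-Chem.

```python
def mgemm_thresh_rem(factor, scale=10000):
    """Convert a desired threshold factor to the integer value for MGEMM_THRESH.

    The manual states that the factor is multiplied by 10000, so the default
    of 10000 corresponds to a factor of 1.
    """
    value = round(factor * scale)
    if value < 1:
        raise ValueError("factor is too small to be represented as an integer")
    return value
```

With this convention a factor of 1 gives 10000 (the default), and a tighter factor of 0.05 gives 500.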
$molecule
0 1
c
h1 c 1.089665
h2 c 1.089665 h1 109.47122063
h3 c 1.089665 h1 109.47122063 h2 120.
h4 c 1.089665 h1 109.47122063 h2 -120.
$end

$rem
METHOD rimp2
BASIS cc-pvdz
AUX_BASIS rimp2-cc-pvdz
CUDA_RIMP2 1
$end

$molecule
0 1
c
h1 c 1.089665
h2 c 1.089665 h1 109.47122063
h3 c 1.089665 h1 109.47122063 h2 120.
h4 c 1.089665 h1 109.47122063 h2 -120.
$end

$rem
METHOD rimp2
BASIS cc-pvdz
AUX_BASIS rimp2-cc-pvdz
CUDA_RIMP2 1
USE_MGEMM 1
MGEMM_THRESH 10000
$end