Keyword ERIS

This keyword controls the calculation method for the three-center electron repulsion integrals (ERIs).
Options:
MULTIPOLE / CONVENTIONAL / DIRECT / MIXED
MULTIPOLE All ERIs are recalculated at each SCF iteration. For long-range ERIs the double asymptotic expansion is used. This is the default.
CONVENTIONAL All ERIs are calculated at the beginning of the SCF procedure and stored.
DIRECT All ERIs are recalculated at each SCF iteration.
MIXED Short-range ERIs are calculated at the beginning of the SCF procedure and stored in RAM. Long-range ERIs are recalculated in each SCF iteration employing the double asymptotic expansion.
RAM=$<$Real$>$ Specifies the RAM per core [in MB] usable in the calculation.
TOL=$<$Real$>$ Threshold for ERI screening.
Description:
The default ERIS option, MULTIPOLE, is a compromise between computational performance and memory (RAM) demand for the three-center ERI calculation [187]. Because no ERIs are stored, the full RAM size is available for the SCF matrices. At the same time, the double asymptotic expansion for the long-range ERIs [74] improves the computational performance. The ERIS option DIRECT [186] is similar to MULTIPOLE; the only difference is that the double asymptotic expansion for the long-range ERIs is disabled. Thus, all ERIs are recalculated by recurrence relations [51] twice in each SCF iteration. As a result, the DIRECT option is always computationally more demanding than the MULTIPOLE option and should only be used for testing and benchmarking.

Whereas the MULTIPOLE option is the method of choice for calculations that are memory bound ($\ge$ 25000 basis functions), the MIXED option can be computationally beneficial for smaller systems, particularly in parallel runs with many SCF iterations. With this option, the short-range ERIs are calculated only once before the SCF procedure and stored in RAM. The long-range ERIs are calculated in each SCF iteration employing the double asymptotic expansion. In parallel runs the ERI storage is distributed over all cores that have free RAM space, i.e. that are not involved in the allocation of SCF matrices. Table 10 lists average timings per SCF cycle for PBE/DZVP/GEN-A2 calculations of n-alkane chains with the DIRECT, MULTIPOLE and MIXED options of the ERIS keyword. Also listed are the number of basis functions, $N_{Basis}$, the number of auxiliary functions, $N_{Auxis}$, and the number of SCF cycles until convergence is reached. Note that the number of SCF cycles was the same for all options and that the converged energies agreed to $10^{-7}$ a.u. or better. All calculations were performed in parallel on a single compute node with two eight-core Intel Xeon E5-2650v2 CPUs @ 2.60GHz and a total of 64 GB RAM.


Table 10: Average time per SCF cycle [sec] for PBE/DZVP/GEN-A2 calculations of n-alkanes using different options of the ERIS keyword.
n-Alkane            $N_{Basis}$  $N_{Auxis}$  SCF cycles  DIRECT  MULTIPOLE  MIXED
C$_{100}$H$_{202}$         2510         4208          15      19         13     11
C$_{150}$H$_{302}$         3760         6308          16      57         42     39
C$_{200}$H$_{402}$         5010         8408          16     122         97     89
C$_{250}$H$_{502}$         6260        10508          39     182        137    136
C$_{300}$H$_{602}$         7510        12608          39     294        229    226


Table 10 shows that the choice of ERIS option makes a noticeable difference for smaller systems, where the linear algebra tasks are not yet dominant. For such systems the MIXED option is advisable, particularly for Born-Oppenheimer molecular dynamics simulations (see Section 4.7).
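
The relative gain of each option can be read off Table 10 directly. The following Python sketch computes the speedup of MULTIPOLE and MIXED over DIRECT from the tabulated per-cycle timings. The names TIMINGS and speedups are illustrative only and are not part of deMon2k.

```python
# Average time per SCF cycle [s] copied from Table 10,
# ordered as (DIRECT, MULTIPOLE, MIXED).
TIMINGS = {
    "C100H202": (19, 13, 11),
    "C150H302": (57, 42, 39),
    "C200H402": (122, 97, 89),
    "C250H502": (182, 137, 136),
    "C300H602": (294, 229, 226),
}


def speedups(direct, multipole, mixed):
    """Return the speedup factors of MULTIPOLE and MIXED relative to DIRECT."""
    return direct / multipole, direct / mixed


for alkane, times in TIMINGS.items():
    s_multipole, s_mixed = speedups(*times)
    print(f"{alkane:10s} MULTIPOLE x{s_multipole:.2f}  MIXED x{s_mixed:.2f}")
```

With these numbers the advantage of MIXED over MULTIPOLE shrinks from 13/11 (about 1.18) for C$_{100}$H$_{202}$ to 229/226 (about 1.01) for C$_{300}$H$_{602}$, in line with the observation that the option matters most for the smaller systems.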

To monitor the RAM space for the ERIS MIXED option a RAM allocation table can be printed with PRINT RAM (see also 4.12.2). The following example shows such a table for a C$_{24}$H$_{50}$ calculation with the aug-cc-pVQZ basis (5270 basis functions) and GEN-A2 auxiliary function set employing 32 cores with 4 GB RAM each.


 *** ERI STATISTIC ***

 Est. Integrated ERIs: 8486773340
 Est. Asymptotic ERIS: 96798170


 *** RAM Allocation ***

 Program part        Size in MBytes

 SCF Kernel                 905.121
 ERI Kernel                   4.416
 DAE Kernel                   1.152
 FIT Kernel                   0.613
 LAG Kernel                 225.340

 Max RAM                   4096.000
 Max SHM                  65536.000

 Integrated ERIs: Incore on 31 CPUs
 Asymptotic ERIs: Direct SCF method


 *** Incore ERI Storage ***

                         Sizes [MBytes]
 #CPU        #ERIs    ERI Vector   Max. Size
    0             SCF kernel allocation
    1     277791345    2119.380    3614.660
    2     273321035    2085.274    3614.660
    3     270188345    2061.373    3614.660
    4     270513580    2063.855    3614.660
    5     270640015    2064.819    3614.660
    6     271920380    2074.588    3614.660
    7     273630545    2087.635    3614.660
    8     278321900    2123.428    3614.660
    9     277542740    2117.483    3614.660
   10     279195440    2130.092    3614.660
   11     278907830    2127.898    3614.660
   12     274740355    2096.103    3614.660
   13     276984955    2113.228    3614.660
   14     276904470    2112.613    3614.660
   15     274954730    2097.738    3614.660
   16     273873285    2089.487    3614.660
   17     274457520    2093.945    3614.660
   18     272614335    2079.882    3614.660
   19     270426255    2063.189    3614.660
   20     270650360    2064.898    3614.660
   21     270687890    2065.185    3614.660
   22     270468465    2063.511    3614.660
   23     271087330    2068.232    3614.660
   24     273499175    2086.633    3614.660
   25     274764620    2096.288    3614.660
   26     273490215    2086.565    3614.660
   27     271100105    2068.330    3614.660
   28     274784945    2096.443    3614.660
   29     272456980    2078.682    3614.660
   30     272598035    2079.758    3614.660
   31     274256160    2092.408    3614.660

The ERI STATISTIC block lists the estimated number of ERIs calculated by recurrence relations (Est. Integrated ERIs) and by the double asymptotic expansion (Est. Asymptotic ERIS). These numbers are estimates because screening due to density matrix elements or fitting coefficients is not included. The RAM Allocation table that follows lists the RAM required for the individual calculation tasks: self-consistent field (SCF) iteration, near-field ERI recurrence relations (ERI), double asymptotic expansion (DAE), density fitting (FIT) and linear algebra (LAG) operations. The next two lines, Max RAM and Max SHM, print the maximum RAM per CPU (in this case set by the MAXRAM parameter; see Table 1) and the maximum shared memory size available to the SCF matrices. Note that the maximum shared memory size for the SCF matrices is 16 times the maximum RAM per CPU because each board of the cluster used here has 16 CPUs.

The remaining output is specific to ERIS MIXED. It states that the near-field ERIs (Integrated ERIs) are held in-core on 31 CPUs and that the double-asymptotically-expanded ERIs (Asymptotic ERIs) are calculated according to the direct SCF method, i.e. they are recalculated twice in each SCF iteration (once for the Kohn-Sham matrix and once for the Coulomb vector). As the Incore ERI Storage table shows, no ERIs are stored on CPU 0 because its RAM is used for the storage of the SCF matrices. On each of the other 31 CPUs slightly more than 2 GB is used for ERI storage.

With the ERIS MIXED option the computational time for the ERI calculation can be reduced to below 10% of the total computational time [188]. It is therefore advisable to check whether the ERIS MIXED option can be used for the application at hand. Note that in a serial run with only near-field ERIs the ERIS MIXED option is equivalent to an in-core SCF.
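
The sizes printed in the Incore ERI Storage table can be cross-checked by hand. The Python sketch below assumes that each stored ERI occupies 8 bytes (one double-precision number) and that 1 MByte equals $2^{20}$ bytes. Both assumptions are inferred from the printed numbers and are not stated in this guide, and the function names are illustrative only.

```python
BYTES_PER_ERI = 8        # assumption: one double-precision number per ERI
BYTES_PER_MB = 2 ** 20   # assumption: 1 MByte = 1048576 bytes


def eri_vector_mb(n_eris):
    """Estimated size in MBytes of a vector holding n_eris integrals."""
    return n_eris * BYTES_PER_ERI / BYTES_PER_MB


def fits_in_core(n_eris, max_size_mb):
    """True if the ERI vector fits into the per-CPU storage limit."""
    return eri_vector_mb(n_eris) <= max_size_mb


# Values copied from the example output above.
MAX_RAM_MB = 4096.000
MAX_SHM_MB = 65536.000
MAX_SIZE_MB = 3614.660
ERIS_ON_CPU = {1: 277791345, 2: 273321035}

for cpu, n_eris in ERIS_ON_CPU.items():
    print(f"CPU {cpu}: {eri_vector_mb(n_eris):.3f} MB, "
          f"in-core: {fits_in_core(n_eris, MAX_SIZE_MB)}")

# Ratio of shared memory to RAM per CPU (16 CPUs per board in this example).
print(MAX_SHM_MB / MAX_RAM_MB)
```

Under these assumptions the sketch reproduces the ERI Vector entries for CPUs 1 and 2 (2119.380 and 2085.274 MBytes), both well below the listed Max. Size of 3614.660 MBytes.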

The CONVENTIONAL option of the ERIS keyword calculates the three-center ERIs before the SCF procedure and, if possible, stores them in RAM. This so-called in-core method is fast as long as all integrals fit into the RAM. The available RAM size (as distinct from system RAM size) is set by deMon2k with the MAXRAM parameter (see Table 1) or the RAM option of the ERIS keyword. If the RAM space is not sufficient, deMon2k will write all ERIs to the scratch file ioeri.scr. As a result, the ERIs must be read from disk at each SCF step. For larger systems this disk I/O becomes the bottleneck of the calculation. Note that the printing of the ERIs in the deMon.out file enabled by PRINT ERIS (see 4.12.2) requires the ERIS CONVENTIONAL option. The same holds for PRINT DEBUG which includes PRINT ERIS.
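
A rough upper bound illustrates why the in-core variant quickly runs out of RAM. The Python sketch below counts the unscreened three-center ERIs as the number of symmetric orbital pairs times the number of auxiliary functions and assumes 8 bytes per stored integral. Both the neglect of screening and the storage size per integral are assumptions made for illustration, so the actual demand in deMon2k is expected to be lower. The function names are hypothetical.

```python
def max_three_center_eris(n_basis, n_auxis):
    """Upper bound: symmetric orbital pairs times auxiliary functions."""
    return n_basis * (n_basis + 1) // 2 * n_auxis


def storage_gib(n_eris, bytes_per_eri=8):
    """Storage in GiB, assuming bytes_per_eri bytes per integral."""
    return n_eris * bytes_per_eri / 2 ** 30


# N_Basis and N_Auxis for C100H202, taken from Table 10.
n_eris = max_three_center_eris(2510, 4208)
print(n_eris, f"{storage_gib(n_eris):.1f} GiB")
```

For the smallest system in Table 10 this unscreened bound is already close to 99 GiB, more than the 64 GB of RAM on the node used for those timings.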

With the RAM option of the ERIS keyword the allocatable RAM size per core can be defined in the deMon input file. This overrides the MAXRAM definition in the parameter.h file. Note that a RAM size definition larger than the available physical memory will result in large paging overhead during program execution.

Screening of the ERIs can be controlled with the TOL option. The threshold $\tau$ is calculated as:

\begin{displaymath}
\tau = \frac{\mathrm{TOL}}{\mbox{Number of Electrons}} \qquad (18)
\end{displaymath}

ERIs with an orbital overlap smaller than $\tau$ are not calculated (screened). The threshold $\tau$ also enters into the double asymptotic expansion radii for the MULTIPOLE method. The density screening threshold for the numerical integration of the exchange-correlation potential is also given by $\tau$. The default settings for TOL are $10^{-14}$ a.u. for the CONVENTIONAL method and $10^{-10}$ a.u. for the DIRECT, MIXED and MULTIPOLE methods. Beware that overly aggressive screening of the ERIs, i.e. too large a TOL value, can harm SCF convergence.
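
As a worked example of equation (18), the following Python sketch evaluates $\tau$ for C$_{24}$H$_{50}$ with the default TOL values quoted above. It assumes that "Number of Electrons" means the total electron count of the neutral molecule. The function names are illustrative and not part of deMon2k.

```python
def screening_threshold(tol, n_electrons):
    """Equation (18): tau = TOL / number of electrons."""
    if n_electrons <= 0:
        raise ValueError("number of electrons must be positive")
    return tol / n_electrons


def alkane_electrons(n_carbon):
    """Electron count of a neutral n-alkane C_n H_(2n+2)."""
    return 6 * n_carbon + (2 * n_carbon + 2)


# Default TOL values [a.u.] as quoted in the text.
DEFAULT_TOL = {
    "CONVENTIONAL": 1e-14,
    "DIRECT": 1e-10,
    "MIXED": 1e-10,
    "MULTIPOLE": 1e-10,
}

n_el = alkane_electrons(24)  # C24H50 has 194 electrons
for option, tol in DEFAULT_TOL.items():
    print(f"{option:12s} tau = {screening_threshold(tol, n_el):.3e}")
```

Because $\tau$ scales inversely with the number of electrons, the effective threshold becomes tighter for larger systems at a fixed TOL.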