# AIQM1

AIQM1 (artificial intelligence–quantum mechanical method 1) is a general-purpose method approaching the gold-standard coupled cluster quantum mechanical method with high computational speed of the approximate low-level semiempirical quantum mechanical methods for the ground-state, closed-shell species, but also transferable for calculation of charged and radical species as well as for excited-state calculations with a good accuracy. See AIQM1 paper [Peikun Zheng, Roman Zubatyuk, Wei Wu, Olexandr Isayev, Pavlo O. Dral, Artificial Intelligence-Enhanced Quantum Chemical Method with Broad Applicability, *Nat. Commun.*, **2021**, *12*, 7022, DOI: 10.1038/s41467-021-27340-2] for more details. Please cite this paper in the publications using the AIQM1 method.

**Strengths:** AIQM1 is especially good for energy calculations and geometry optimizations of closed-shell molecules in their ground-state.

**Limitations: **This method is currently limited to compounds only containing H, C, N, and O elements.

**Availability: **AIQM1 is available in MLatom package interfaced to the MNDO program, TorchANI, and dftd4. Some features also require interfaces to the Gaussian or ASE packages. Installation instructions, a usage manual, and proper citations are given below.

Table of Contents

## Installation

### MLatom

It is recommended to install the latest MLatom release as described in download page. See manual pages on this website for license and further instructions.

### MNDO

MNDO program is required to provide the ODM2* part of AIQM1.

The free binary and open-source code of the MNDO program is available from the official distributors of the MNDO code as described at https://www.kofo.mpg.de/en/institute/history/1993-to-present/theoretical-chemistry.

After the MNDO program is installed, you need to set the environmental variable pointing to the MNDO executable (typically `mndo99`

), e.g., in bash:

`export mndobin=[path to the executable]/mndo99`

### TorchANI

TorchANI program is required to provide the neural network (NN) part of AIQM1.

TorchANI is an open-source package. The latest version when writing this tutorial is v2.2

**Installation**

1. install *Numpy* and nightly version of *PyTorch* (if you do not have them already):

`pip install numpy`

`pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html`

2. install TorchANI:

`pip install torchani`

Visit https://aiqm.github.io/torchani/ for more info. The latest version of TorchANI used when writing this tutorial is v2.2, you can install this version by `pip install torchani==2.2`

if there have problem when running with the newest verison of TorchANI. The CUDA extension for AEV calculation is not supported for the NN part of AIQM1 now.

### dftd4

dftd4 program is required to provide the D4 part of AIQM1.

The dftd4 program can be obtained as both executable and open-source code. We recommend to use dftd4 v2.5.0, which can calculate Hessian needed for thermochemical calculations. To install the dftd4 program from source code, please see the README.md file on dftd4 GitHub page for more details.

After the dftd4 program is installed, you need to set the environmental variable pointing to the `dftd4`

executable, e.g., in bash:

`export dftd4bin=[path to the executable]/dftd4`

### Optional packages for additional features

To perform geometry optimizations or calculations of heats of formation, MLatom relies on either Gaussian or ASE package (any of them can be used; the first one is commercial, the second one is open-source).

#### Gaussian

Our implementation work with both Gaussian 09 and Gaussian 16. It is a commercial program, which can be obtained and installed separately.

To use Gaussian interface, make sure that your environmental variable `GAUSS_EXEDIR`

points to the right place.

#### ASE

The ASE (Atomic Simulation Environment) are Python modules, which can be installed as described on ASE website.

## Usage

To perform AIQM1 calculations, you can provide input files with appropriate options to MLatom (command line options also supported) and then run MLatom calculations as usual (see manual and tutorial pages on this website). Below we provide examples of some typical uses. In all cases, we use `mlatom`

as alias to `$pathToMLatom/MLatom.py`

(it is useful to setup such an alias in your shell).

### Get started: Simplest calculation job

Single-point calculations of energies (and gradients if needed) of closed-shell molecules in electronic ground state is the simplest job, which can be run with 3-4 line MLatom input file, e.g., `sp.inp`

:

```
AIQM1 # or AIQM1@DFT or AIQM1@DFT* if you want to use these methods
xyzfile=sp.xyz
yestfile=enest.dat
ygradxyzestfile=gradest.dat
```

This input requires a `sp.xyz`

file with XYZ geometries of molecules (you can provide many molecules as usual for MLatom), e.g., for hydrogen and methane `sp.xyz`

file can look like (geometries in Å):

```
2
H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5
C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H 0.627580 -0.627580 -0.627580
H -0.627580 0.627580 -0.627580
```

After you prepared your input files

and `sp`

.inp`sp.xyz`

, you can run MLatom as usual:

`mlatom sp.inp > sp.out`

After the calculations finish, MLatom output `sp.out`

will contain the standard deviation of NN prediction and compoents of AIQM1 energies:

```
Standard deviation of NN contribution : 0.00892407 Hartree 5.59994 kcal/mol
NN contribution : -0.00210898 Hartree
Sum of atomic self energies : -0.08587317 Hartree
ODM2* contribution : -1.09094119 Hartree
D4 contribution : -0.00000889 Hartree
Total energy : -1.17893224 Hartree
Standard deviation of NN contribution : 0.00025608 Hartree 0.16069 kcal/mol
NN contribution : 0.00958812 Hartree
Sum of atomic self energies : -33.60470494 Hartree
ODM2* contribution : -6.86968756 Hartree
D4 contribution : -0.00010193 Hartree
Total energy : -40.46490632 Hartree
```

You can also get AIQM1 energies saved in file `enest.dat`

looking like this (energies in Hartree):

```
-1.178932238420
-40.464906315250
```

and gradients saved in file `gradest.dat`

looking like this (gradients in Hartree/Å):

```
2
0.000000000000 0.000000000000 0.000032023551
0.000000000000 0.000000000000 -0.000032023551
5
-0.000000000000 -0.000000000000 0.000000000000
0.000490470799 0.000490470714 0.000490470881
-0.000490470799 -0.000490470714 0.000490470881
0.000490470799 -0.000490470714 -0.000490470881
-0.000490470799 0.000490470714 -0.000490470881
```

Note that your output may have very minor numerical differences.

### Geometry optimization

Geometry optimization of closed-shell molecules in electronic ground state is as simple as running single point calculations and 4-line MLatom input file, e.g., `opt.inp`

, looks like this:

```
AIQM1
xyzfile=opt.xyz
optxyz=final_geo.xyz
geomopt
optprog=gaussian # or optprog=ase if you choose ASE
```

This input requires `opt.xyz`

file with initial XYZ geometries of molecules to be optimized (you can provide many molecules as usual for MLatom), e.g., for hydrogen and methane `opt.xyz`

file can look like (geometries in Å):

```
2
H 0.0000000000 0.0000000000 0.0000000000
H 0.7414000000 0.0000000000 0.0000000000
5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
```

After you prepared your input files

and `opt`

.inp`opt.xyz`

, you can run MLatom as usual:

`mlatom opt.inp`

After the calculations finish, the optimized geometries are saved in either Gaussian or ASE-style output files.

If you use Gaussian, the Gaussian output files of geometry optimizations are saved in `mol_1.log`

, `mol_2.log`

, … files for each molecule, where you can find the optimized geometries and visualize them by GausianView program as in any other typical Gaussian job.

If you use ASE, you can directly obtain optimized XYZ geometries in a single file final_geo.xyz with MLatom format, which for our example looks like (geometries in Å):

```
2
H 0.00770082 0.00000000 0.00000000
H 0.73369918 0.00000000 0.00000000
5
C 0.00000000 0.00000000 0.00000000
H 1.08666998 -0.00000000 0.00000000
H -0.36222332 -1.02452229 -0.00000000
H -0.36222332 0.51226114 -0.88726233
H -0.36222332 0.51226114 0.88726233
```

### Calculation of thermochemical properties

Thermochemical properties of closed-shell molecules in electronic ground state can be calculated at AIQM1 level by adding an option freq to the MLatom input file, e.g., `freq.inp`

and they are typically run on AIQM1-optimized geometries. An example of MLatom input file:

```
AIQM1
xyzfile=freq.xyz
freq
optprog=gaussian # or optprog=ase if you choose ASE
```

This input requires `freq.xyz`

file with initial XYZ geometries of molecules (you can provide many molecules as usual for MLatom), e.g., for hydrogen and methane `freq.xyz`

file can look like (geometries in Å):

```
2
H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5
C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H 0.627580 -0.627580 -0.627580
H -0.627580 0.627580 -0.627580
```

After you prepared your input files

and `freq`

.inp`freq.xyz`

, you can run MLatom as usual:

`mlatom freq.inp > freq.out`

After the calculations finish, MLatom output `freq.out`

will contain the summary with atomization enthalpy at 0 K, ZPVE-exclusive atomization energy at 0 K, and heat of formation at 298 K for each molecule.

If you use Gaussian, the Gaussian output files of frequency calculations are saved in `mol_1.log`

, `mol_2.log`

, … files for each molecule; these files contain ZPVE energy and lots of thermochemical data such as entropy and the Gibbs free energy. MLatom output will contain additional information as described above with following lines for hydrogen and methane example:

```
Standard deviation of NN contribution : 0.00892407 Hartree 5.59994 kcal/mol
NN contribution : -0.00210898 Hartree
Sum of atomic self energies : -0.08587317 Hartree
ODM2* contribution : -1.09094119 Hartree
D4 contribution : -0.00000889 Hartree
Total energy : -1.17893224 Hartree
Atomization enthalpy at 0 K : 106.17239 kcal/mol
ZPE exclusive atomization energy at 0 K : 111.17678 kcal/mol
Heat of formation at 298.15 K : -2.85652 kcal/mol
* Warning * Heat of formation have high uncertainty!
Standard deviation of NN contribution : 0.00025608 Hartree 0.16069 kcal/mol
NN contribution : 0.00958812 Hartree
Sum of atomic self energies : -33.60470494 Hartree
ODM2* contribution : -6.86968756 Hartree
D4 contribution : -0.00010193 Hartree
Total energy : -40.46490632 Hartree
Atomization enthalpy at 0 K : 391.58894 kcal/mol
ZPE exclusive atomization energy at 0 K : 419.90907 kcal/mol
Heat of formation at 298.15 K : -17.30543 kcal/mol
```

When you use ASE for the calculation of thermochemical properties, you should specify `ase.linear`

and `ase.symmetrynumber`

this two keywrods. `ase.linear`

is 0 for nonlinear molecule, 1 for linear molecule, and `ase.symmetrynumber`

is the rotational symmetry number for each molecule, (see Table 10.1 and Appendix B of C. Cramer “Essentials of Computational Chemistry”, 2nd Ed.). For example, for hydrogen and methane this two molecules, you should set `ase.linear=1,0`

and `ase.symmetrynumber=2,12`

.

If you use ASE, MLatom output will contain the same lines as above, but also include additional data such as entropy and the Gibbs free energy.

### Beyond closed-shell molecules in ground state

If we go beyond closed-shell molecules in their ground-state, you need to tell MLatom the charge and multiplicity and/or what excited-state properties you want to calculate. Since this information will change the semiempirical quantum mechanical part of AIQM1, you have to inform the MNDO program to do the corresponding calculations. MLatom implementation therefore supports the request to read user-defined MNDO keywords via option `mndokeywords=[file name with MNDO keywords]`

, which are passed to the MNDO program. Please consult the MNDO program manual for the available keywords. Note: whenever you request special keywords for MNDO, keywords `iop=-22 immdp=-1 nsav15=3 igeom=1 iform=1`

are always required and AIQM1 calculations can be run only for a single molecule in xyz file.

Below we show several typical examples for calculating charged species and excited-state properties.

#### Calculations of charged species and radicals

For example, if we want to optimize geometry of protonated water H_{3}O^{+}, then you can use the following MLatom input `opt.inp`

:

`AIQM1 xyzfile=opt.xyz`

`optxyz=final_geo.xyz geomopt optprog=ase mndokeywords=mndokw`

This input requires `opt.xyz`

file with initial XYZ geometries of molecules to be optimized, e.g. (geometries in Å):

```
4
O 0.00000000 0.00000000 0.08727273
H 0.00000000 0.90509668 -0.23272727
H -0.78383672 -0.45254834 -0.23272727
H 0.78383672 -0.45254834 -0.23272727
```

and `mndokw`

file with MNDO keywords:

```
iop=-22 immdp=-1 +
igeom=1 iform=1 +
jop=-2 nsav15=3 +
kharge=1 imult=0
```

The first line in `mndokw`

file requests the use of the ODM2* Hamiltonian (you do not need to modify it), the second line specifies that geometry is given in a free XYZ format, the third line requests calculation of gradients and saving energies and gradients into fort.15 file (required for geometry optimization), the last line sets charge (kharge=1) and multiplicity (imult=0).

After you prepared your input files

, `opt`

.inp`opt.xyz`

, and `mndokw`

you can run MLatom as usual:

`mlatom opt.inp`

The output geometry is saved in final_geo.xyz with MLatom format, which for our example looks like (geometries in Å):

```
4
O 0.00000000 -0.00000000 0.12401796
H 0.00000000 0.90385318 -0.24497568
H -0.78275982 -0.45192659 -0.24497568
H 0.78275982 -0.45192659 -0.24497568
```

When you want to calculate the heat of formation of H_{3}O^{+} with the above optimized geometry, then you can use the following MLatom input `freq.inp`

:

```
AIQM1
xyzfile=final_geo.xyz
freq
optprog=ase
mndokeywords=mndokw
```

and `mndokw`

file with MNDO keywords:

```
iop=-22 immdp=-1 +
igeom=1 iform=1 +
jop=2 nsav15=3 +
kharge=1 imult=0
```

After you run MLatom by `mlatom freq.inp > freq.out`

, you can get the following heat of formation from the output:

`Heat of formation at 298.15 K : 150.84947 kcal/mol`

`* Warning * Heat of formation have high uncertainty!`

#### Vertical excitation energies

For example, if we want to calculate the vertical excitation energies of ethene in ^{1}B_{1u} state, then you can use the following MLatom input `vee.inp`

:

```
AIQM1
xyzfile=vee.xyz
mndokeywords=mndokw
```

This input requires a `vee.xyz`

file with XYZ geometries of molecules, e.g. (geometries in Å):

```
6
Ethene 1B1u
C 0.000000 0.000000 0.668188
C 0.000000 0.000000 -0.668188
H 0.000000 0.923274 1.238289
H 0.000000 -0.923274 1.238289
H 0.000000 0.923274 -1.238289
H 0.000000 -0.923274 -1.238289
```

and `mndokw`

file with MNDO keywords:

```
iop=-22 immdp=-1 +
igeom=1 iform=1 +
jop=-2 nsav15=3 +
kci=5 ici1=1 ici2=1 iroot=2 ioutci=2 +
movo=-1 nciref=1 mciref=3 levexc=2
```

After you prepared your input files `vee.inp`

, `vee.xyz`

, and `mndokw`

you can run MLatom as usual:

`mlatom vee.inp`

The output of vertical excitation energy is saved in the MNDO program outfile `mndo.out`

, where you can see:

`State 2, Mult. 1, B1u (4), E-E(1)= 7.910791 eV, E= -304.054934 eV`

#### Excited-state geometry optimization

For example, if we want to optimize geometry of formaldehyde in ^{1}n*π** excited state, then you can use the following MLatom input `opt.inp`

:

`AIQM1 xyzfile=opt.xyz`

`optxyz=final_geo.xyz geomopt optprog=ase mndokeywords=mndokw`

This input requires `opt.xyz`

file with initial XYZ geometries of molecules to be optimized, e.g. (geometries in Å):

```
4
formaldehyde 1npi*
C -0.02667227 0.64339915 -0.00000000
O -0.02667227 -0.71674790 0.00000000
H 0.18670592 0.93679416 1.04362466
H 0.18670592 0.93679416 -1.04362466
```

and `mndokw`

file with MNDO keywords:

```
iop=-22 immdp=-1 +
igeom=1 iform=1 +
jop=-2 nsav15=3 +
kci=5 ici1=6 ici2=4 jci1=1 jci2=1 ncisym=2 +
movo=-1 nciref=1 mciref=3 levexc=2 iroot=1 lroot=1
```

The first line in `mndokw`

file requests the use of the ODM2* Hamiltonian (you do not need to modify it), the second line specifies that geometry is given in a free XYZ format, the third line requests calculation of gradientsand saving energies and gradients into fort.15 file (required for geometry optimization), and the following lines setup the type of excited-state calculations. `kci=5`

requests GUGA-CI approach for excited state calculation; `ici1`

and `ici2`

define the number of active occupied orbitals and unoccupied orbitals respectively; `jci1`

and `jci2`

define the number of occupied and unoccupied pi-MOs included in the active space; `ncisym`

define the state sysmetry. `nciref`

and `mciref`

define the number of reference occupations and definition of reference occupations respectively; `levexc`

define the maximum excitation level; `iroot`

and `lroot`

is the number of lowest CI states computed and the number of the CI state of interest.

After you prepared your input files

, `opt`

.inp`opt.xyz`

, and `mndokw`

you can run MLatom as usual:

`mlatom opt.inp`

The output geometry is saved in final_geo.xyz with MLatom format, which for our example looks like (geometries in Å):

```
4
C -0.10906735 0.55934608 -0.00000000
O -0.02784720 -0.77782274 0.00000000
H 0.22849093 1.00935811 0.92370071
H 0.22849093 1.00935811 -0.92370071
```

## Citations

If you have used AIQM1, then the following citations are appropriate in your publication:

**AIQM1:**P. Zheng, R. Zubatyuk, W. Wu, O. Isayev, P. O. Dral, Artificial Intelligence-Enhanced Quantum Chemical Method with Broad Applicability,*Nat. Commun.*,**2021**,*12*, 7022. DOI: 10.1038/s41467-021-27340-2.

**MLatom:**Pavlo O. Dral, Fuchun Ge, Bao-Xin Xue, Yi-Fan Hou, Max Pinheiro Jr, Jianxing Huang, Mario Barbatti, MLatom 2: An Integrative Platform for Atomistic Machine Learning.*Top. Curr. Chem.***2021**,*379*, 27. DOI: 10.1007/s41061-021-00339-5.**MLatom:**Pavlo O. Dral, Peikun Zheng, Bao-Xin Xue, Fuchun Ge, Yi-Fan Hou, Max Pinheiro Jr,*MLatom: A Package for Atomistic Simulations with Machine Learning*, version 2.0.1beta. Xiamen University, Xiamen, China,**2013–2021**. http://MLatom.com.**ODM2* Hamiltonian:**P. O. Dral, X. Wu, W. Thiel,*J. Chem. Theory Comput.***2019**,*15*, 1743.**MNDO program:**W. Thiel, MNDO [check your version](Max-Planck-Institut fur Kohlenforschung, Mulheim an der Ruhr, [year])**D4:**E. Caldeweyher, C. Bannwarth, S. Grimme, J. Chem. Phys. 2017, 147, 034112**D4 program:**E. Caldeweyher, S. Ehlert, S. Grimme, DFT-D4, Version [check your version], (Mulliken Center for Theoretical Chemistry, University of Bonn, [year])**ANI model:**J. S. Smith, O. Isayev, A. E. Roitberg, Chem. Sci. 2017, 8, 3192**TorchANI program:**X. Gao, F. Ramezanghorbani, O. Isayev, J. S. Smith, A. E. Roitberg,*J. Chem. Inf. Model.***2020**,*60*, 3408

## Leave a Reply