# Calculations of enthalpies of formation

This tutorial describes how to calculate enthalpies of formation with ANI-1ccx and AIQM1 approaches as described in:

Peikun Zheng, Wudi Yang, Wei Wu, Olexandr Isayev, Pavlo O. Dral, Toward Chemical Accuracy in Predicting Enthalpies of Formation with General-Purpose Data-Driven Methods, J. Phys. Chem. Lett.

2022, 13, 3479–3491. DOI: 10.1021/acs.jpclett.2c00734.

Please cite this paper in the publications using the above calculations.

## Installation

Required installations and other types of calculations with AIQM1 are given in a dedicated tutorial. Below minimum installation requirements are given for ANI-1ccx.

### MLatom

You need MLatom version 2.2 or newer.

It is recommended to install the latest MLatom release as described in download page. See manual pages on this website for license and further instructions.

### TorchANI

TorchANI program is required to provide the neural network (NN) part of AIQM1.

TorchANI is an open-source package. The latest version when writing this tutorial is v2.2.

**Installation**

1. install *Numpy* and nightly version of *PyTorch* (if you do not have them already):

`pip install numpy`

`pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html`

2. install TorchANI:

`pip install torchani`

Visit https://aiqm.github.io/torchani/ for more info. The latest version of TorchANI used when writing this tutorial is v2.2, you can install this version by `pip install torchani==2.2`

if there are any problems when running with the newest version of TorchANI. The CUDA extension for AEV calculation is not supported for the NN part of AIQM1 and ANI-1ccx now.

### Other packages

To perform geometry optimizations or calculations of enthalpies of formation, MLatom relies on either Gaussian or ASE package (any of them can be used; the first one is commercial, the second one is open-source).

#### Gaussian

Our implementation work with both Gaussian 09 and Gaussian 16. It is a commercial program, which can be obtained and installed separately.

To use Gaussian interface, make sure that your environmental variable `GAUSS_EXEDIR`

points to the right place.

#### ASE

The ASE (Atomic Simulation Environment) are Python modules, which can be installed as described on ASE website.

## Usage

To perform calculations, you can provide input files with appropriate options to MLatom (command line options also supported) and then run MLatom calculations as usual (see manual and tutorial pages on this website). Below we provide examples of some typical uses. In all cases, we use `mlatom`

as alias to `$pathToMLatom/MLatom.py`

(it is useful to setup such an alias in your shell).

The calculations of enthalpies of formation consist of two steps: geometry optimization and frequency calculations.

### Geometry optimization

Geometry optimization of closed-shell molecules in electronic ground state is as simple as running single point calculations and 4-line MLatom input file, e.g., `opt.inp`

, looks like this:

`ANI-1ccx xyzfile=opt.xyz`

`optxyz=final_geo.xyz geomopt optprog=gaussian # or optprog=ase if you choose ASE`

This input requires `opt.xyz`

file with initial XYZ geometries of molecules to be optimized (you can provide many molecules as usual for MLatom), e.g., for hydrogen and methane `opt.xyz`

file can look like (geometries in Å):

```
2
H 0.0000000000 0.0000000000 0.0000000000
H 0.7414000000 0.0000000000 0.0000000000
5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
```

After you prepared your input files

and `opt`

.inp`opt.xyz`

, you can run MLatom as usual:

`mlatom opt.inp`

After the calculations finish, the optimized geometries are saved in finla_geo.xyz.

If you use Gaussian, the Gaussian output files of geometry optimizations are saved in `mol_1.log`

, `mol_2.log`

, … files for each molecule, where you can find the optimized geometries and visualize them by GausianView program as in any other typical Gaussian job.

If you use ASE, you can directly obtain optimized XYZ geometries in a single file final_geo.xyz with MLatom format, which for our example looks like (geometries in Å):

```
2
H 0.15255733 0.00000000 0.00000000
H 0.58884267 0.00000000 0.00000000
5
C 0.00000000 0.00000000 0.00000000
H 1.08733372 -0.00000000 0.00000000
H -0.36244456 -1.02514806 0.00000000
H -0.36244456 0.51257403 -0.88780426
H -0.36244456 0.51257403 0.88780426
```

### Frequency calculations

Thermochemical properties of closed-shell molecules in the electronic ground state can be calculated at ANI-1ccx or AIQM1 level by adding an argument `freq`

to the MLatom input file, e.g., `freq.inp`

, and they should be run on geometries optimized with the corresponding model. An example of MLatom input file using the ANI-1ccx-optimized geometries:

```
ANI-1ccx
xyzfile=optgeoms.xyz
freq
optprog=ASE # or optprog=gaussian if you choose Gaussian
ase.linear=1,0
ase.symmetrynumber=2,12
```

When ASE is used for the calculation of thermochemical properties, you should specify

and `ase.linear`

this two keywrods. `ase.symmetrynumber`

is 0 for nonlinear molecule, 1 for linear molecule, and `ase.linear`

is the rotational symmetry number for each molecule (see Table 10.1 and Appendix B of C. Cramer “Essentials of Computational Chemistry”, 2nd Ed.). For example, for hydrogen and methane this two molecules, you should set `ase.symmetrynumber`

and `ase.linear=1,0`

.`ase.symmetrynumber=2,12`

File with preoptimized geometries `optgeoms.xyz`

for our example are (geometries in Å):

```
2
hydrogen
H 0.15255733 0.00000000 0.00000000
H 0.58884267 0.00000000 0.00000000
5
methane
C 0.00000000 0.00000000 0.00000000
H 1.08733372 -0.00000000 0.00000000
H -0.36244456 -1.02514806 0.00000000
H -0.36244456 0.51257403 -0.88780426
H -0.36244456 0.51257403 0.88780426
```

After you prepared your input files `freq.inp`

and `optgeoms.xyz`

, you can run MLatom as usual:

`mlatom freq.inp > freq.out`

After the calculations finish, MLatom output `freq.out`

will contain the summary with atomization enthalpy at 0 K, ZPVE-exclusive atomization energy at 0 K, and heat of formation at 298.15 K for each molecule. If you use ASE, MLatom output will contain the same lines as above, but also include additional data such as entropy and the Gibbs free energy:

...... Zero-point vibrational energy : 4.07528 kcal/mol Atomization enthalpy at 0 K : 126.42974 kcal/mol ZPE exclusive atomization energy at 0 K : 130.50502 kcal/mol Heat of formation at 298.15 K : -23.11424 kcal/mol * Warning * Heat of formation have high uncertainty! ...... Zero-point vibrational energy : 27.87144 kcal/mol Atomization enthalpy at 0 K : 391.92513 kcal/mol ZPE exclusive atomization energy at 0 K : 419.79657 kcal/mol Heat of formation at 298.15 K : -17.63420 kcal/mol

If you use Gaussian, the Gaussian output files of frequency calculations are saved in `mol_1.log`

, `mol_2.log`

, … files for each molecule; these files contain ZPVE energy and lots of thermochemical data such as entropy and the Gibbs free energy.

## Leave a Reply