Calculations of enthalpies of formation
This tutorial describes how to calculate enthalpies of formation with ANI-1ccx and AIQM1 approaches as described in:
Peikun Zheng, Wudi Yang, Wei Wu, Olexandr Isayev, Pavlo O. Dral, Toward Chemical Accuracy in Predicting Enthalpies of Formation with General-Purpose Data-Driven Methods, J. Phys. Chem. Lett. 2022, 13, 3479–3491. DOI: 10.1021/acs.jpclett.2c00734.
Please cite this paper in the publications using the above calculations.
Installation
Required installations and other types of calculations with AIQM1 are given in a dedicated tutorial. Below minimum installation requirements are given for ANI-1ccx.
MLatom
You need MLatom version 2.2 or newer.
It is recommended to install the latest MLatom release as described in download page. See manual pages on this website for license and further instructions.
TorchANI
TorchANI program is required to provide the neural network (NN) part of AIQM1.
TorchANI is an open-source package. The latest version when writing this tutorial is v2.2.
Installation
1. install Numpy and nightly version of PyTorch (if you do not have them already):
pip install numpy
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html
2. install TorchANI:
pip install torchani
Visit https://aiqm.github.io/torchani/ for more info. The latest version of TorchANI used when writing this tutorial is v2.2, you can install this version by pip install torchani==2.2
if there are any problems when running with the newest version of TorchANI. The CUDA extension for AEV calculation is not supported for the NN part of AIQM1 and ANI-1ccx now.
Other packages
To perform geometry optimizations or calculations of enthalpies of formation, MLatom relies on either Gaussian or ASE package (any of them can be used; the first one is commercial, the second one is open-source).
Gaussian
Our implementation work with both Gaussian 09 and Gaussian 16. It is a commercial program, which can be obtained and installed separately.
To use Gaussian interface, make sure that your environmental variable GAUSS_EXEDIR
points to the right place.
ASE
The ASE (Atomic Simulation Environment) are Python modules, which can be installed as described on ASE website.
Usage
To perform calculations, you can provide input files with appropriate options to MLatom (command line options also supported) and then run MLatom calculations as usual (see manual and tutorial pages on this website). Below we provide examples of some typical uses. In all cases, we use mlatom
as alias to $pathToMLatom/MLatom.py
(it is useful to setup such an alias in your shell).
The calculations of enthalpies of formation consist of two steps: geometry optimization and frequency calculations.
Geometry optimization
Geometry optimization of closed-shell molecules in electronic ground state is as simple as running single point calculations and 4-line MLatom input file, e.g., opt.inp
, looks like this:
ANI-1ccx xyzfile=opt.xyz
optxyz=final_geo.xyz geomopt optprog=gaussian # or optprog=ase if you choose ASE
This input requires opt.xyz
file with initial XYZ geometries of molecules to be optimized (you can provide many molecules as usual for MLatom), e.g., for hydrogen and methane opt.xyz
file can look like (geometries in Å):
2
H 0.0000000000 0.0000000000 0.0000000000
H 0.7414000000 0.0000000000 0.0000000000
5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
After you prepared your input files
and opt
.inpopt.xyz
, you can run MLatom as usual:
mlatom opt.inp
After the calculations finish, the optimized geometries are saved in finla_geo.xyz.
If you use Gaussian, the Gaussian output files of geometry optimizations are saved in mol_1.log
, mol_2.log
, … files for each molecule, where you can find the optimized geometries and visualize them by GausianView program as in any other typical Gaussian job.
If you use ASE, you can directly obtain optimized XYZ geometries in a single file final_geo.xyz with MLatom format, which for our example looks like (geometries in Å):
2
H 0.15255733 0.00000000 0.00000000
H 0.58884267 0.00000000 0.00000000
5
C 0.00000000 0.00000000 0.00000000
H 1.08733372 -0.00000000 0.00000000
H -0.36244456 -1.02514806 0.00000000
H -0.36244456 0.51257403 -0.88780426
H -0.36244456 0.51257403 0.88780426
Frequency calculations
Thermochemical properties of closed-shell molecules in the electronic ground state can be calculated at ANI-1ccx or AIQM1 level by adding an argument freq
to the MLatom input file, e.g., freq.inp
, and they should be run on geometries optimized with the corresponding model. An example of MLatom input file using the ANI-1ccx-optimized geometries:
ANI-1ccx
xyzfile=optgeoms.xyz
freq
optprog=ASE # or optprog=gaussian if you choose Gaussian
ase.linear=1,0
ase.symmetrynumber=2,12
When ASE is used for the calculation of thermochemical properties, you should specify ase.linear
and ase.symmetrynumber
this two keywrods. ase.linear
is 0 for nonlinear molecule, 1 for linear molecule, and ase.symmetrynumber
is the rotational symmetry number for each molecule (see Table 10.1 and Appendix B of C. Cramer “Essentials of Computational Chemistry”, 2nd Ed.). For example, for hydrogen and methane this two molecules, you should set ase.linear=1,0
and ase.symmetrynumber=2,12
.
File with preoptimized geometries optgeoms.xyz
for our example are (geometries in Å):
2
hydrogen
H 0.15255733 0.00000000 0.00000000
H 0.58884267 0.00000000 0.00000000
5
methane
C 0.00000000 0.00000000 0.00000000
H 1.08733372 -0.00000000 0.00000000
H -0.36244456 -1.02514806 0.00000000
H -0.36244456 0.51257403 -0.88780426
H -0.36244456 0.51257403 0.88780426
After you prepared your input files freq.inp
and optgeoms.xyz
, you can run MLatom as usual:
mlatom freq.inp > freq.out
After the calculations finish, MLatom output freq.out
will contain the summary with atomization enthalpy at 0 K, ZPVE-exclusive atomization energy at 0 K, and heat of formation at 298.15 K for each molecule. If you use ASE, MLatom output will contain the same lines as above, but also include additional data such as entropy and the Gibbs free energy:
...... Zero-point vibrational energy : 4.07528 kcal/mol Atomization enthalpy at 0 K : 126.42974 kcal/mol ZPE exclusive atomization energy at 0 K : 130.50502 kcal/mol Heat of formation at 298.15 K : -23.11424 kcal/mol * Warning * Heat of formation have high uncertainty! ...... Zero-point vibrational energy : 27.87144 kcal/mol Atomization enthalpy at 0 K : 391.92513 kcal/mol ZPE exclusive atomization energy at 0 K : 419.79657 kcal/mol Heat of formation at 298.15 K : -17.63420 kcal/mol
If you use Gaussian, the Gaussian output files of frequency calculations are saved in mol_1.log
, mol_2.log
, … files for each molecule; these files contain ZPVE energy and lots of thermochemical data such as entropy and the Gibbs free energy.
Leave a Reply