Case study 1: ML-NEA spectrum for a single molecule
This page explains how to follow case study 1 from the book chapter “Learning excited-state properties”:
Julia Westermayr*, Pavlo O. Dral, Philipp Marquetand. Learning excited-state properties. DOI: 10.1016/B978-0-323-90049-2.00004-4.
In Quantum Chemistry in the Age of Machine Learning, Pavlo O. Dral, Ed. Elsevier: 2023. DOI: 10.1016/B978-0-323-90049-2.00014-7.
Preparation
Download the package with the tutorial:
This package contains MLatom 2.0.5 and directories with input files and data for examples 1-3. All examples can only be run on a Linux machine, and you need Python 3.7+ installed.
You also need to:
- install Newton-X (NX), version 2.2
- define the environment variable NX in the bash shell with export NX=/path/to/Newton-X
- install matplotlib with the command
python3 -m pip install matplotlib
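Before running the examples, it can help to confirm that the requirements listed above are in place. The following is a minimal sketch, not part of the tutorial package: the function name check_prerequisites is hypothetical, and the script only checks the Python version, the platform, the NX variable, and whether matplotlib can be found. It does not verify the Newton-X version.

```python
import importlib.util
import os
import sys


def check_prerequisites(environ=None):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper, not part of MLatom or Newton-X.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # The tutorial asks for Python 3.7 or newer.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7+ is required, found " + sys.version.split()[0])

    # The examples are stated to run on Linux only.
    if not sys.platform.startswith("linux"):
        problems.append("the examples are meant for Linux, found " + sys.platform)

    # NX must point to the Newton-X installation directory.
    nx = environ.get("NX")
    if not nx:
        problems.append("environment variable NX is not set")
    elif not os.path.isdir(nx):
        problems.append("NX does not point to an existing directory: " + nx)

    # matplotlib is needed for plotting the spectra.
    if importlib.util.find_spec("matplotlib") is None:
        problems.append("matplotlib is not installed")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("All prerequisite checks passed.")
```

Save it under any name and run it with python3 in the same bash shell in which you exported NX, so that the variable is visible to the script.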
Unpack the data by running the command
unzip tutorial_MLinQC22-ML-ES-props.zip
Go to the directory tutorial_MLinQC22-ML-ES-props:
cd tutorial_MLinQC22-ML-ES-props
Example 1
Go to the directory with example 1:
cd example1
Run MLatom:
python3 ../MLatom_v2-0-5/MLatom.py ML-NEA.inp &> ML-NEA.out
Depending on your machine, it may take from a couple of minutes to dozens of minutes. The output ML-NEA.out should contain the following lines:
==========================================================================================
run ML-NEA iteratively for spectrum generation ( ML_train_iter ) started at Wed Dec 1 12:00:19 2021 CST
ML-NEA iteration 1: train_number = 50; RMSE_geom = 0.06717941145022376; rRMSE = 1.0
ML-NEA iteration 2: train_number = 100; RMSE_geom = 0.09043318436728051; rRMSE = 0.25713761026721255
ML-NEA iteration 3: train_number = 150; RMSE_geom = 0.06411060145373663; rRMSE = 0.410580813729204
ML-NEA iteration 4: train_number = 200; RMSE_geom = 0.0695737045717655; rRMSE = 0.07852252732055763
ML-NEA iteration ended after 4 iteration!
run ML-NEA iteratively for spectrum generation ( ML_train_iter ) finished at Wed Dec 1 12:08:01 2021 CST |||| total spent 462.02 sec
==========================================================================================
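To follow how the iterative training progresses, the iteration lines can be pulled out of ML-NEA.out with a short script. This is a sketch that assumes the line format is exactly as shown in the output above; the function name parse_iterations and the embedded SAMPLE_LOG are illustrative and not part of MLatom.

```python
import re

# Pattern for lines such as:
# ML-NEA iteration 2: train_number = 100; RMSE_geom = 0.09...; rRMSE = 0.25...
ITER_RE = re.compile(
    r"ML-NEA iteration\s+(\d+):\s*train_number\s*=\s*(\d+);"
    r"\s*RMSE_geom\s*=\s*([0-9.eE+-]+);\s*rRMSE\s*=\s*([0-9.eE+-]+)"
)

# Iteration lines copied from the example output shown above.
SAMPLE_LOG = """\
ML-NEA iteration 1: train_number = 50; RMSE_geom = 0.06717941145022376; rRMSE = 1.0
ML-NEA iteration 2: train_number = 100; RMSE_geom = 0.09043318436728051; rRMSE = 0.25713761026721255
ML-NEA iteration 3: train_number = 150; RMSE_geom = 0.06411060145373663; rRMSE = 0.410580813729204
ML-NEA iteration 4: train_number = 200; RMSE_geom = 0.0695737045717655; rRMSE = 0.07852252732055763
ML-NEA iteration ended after 4 iteration!
"""


def parse_iterations(text):
    """Return one dict per 'ML-NEA iteration N: ...' line found in text."""
    rows = []
    for match in ITER_RE.finditer(text):
        rows.append({
            "iteration": int(match.group(1)),
            "train_number": int(match.group(2)),
            "RMSE_geom": float(match.group(3)),
            "rRMSE": float(match.group(4)),
        })
    return rows


if __name__ == "__main__":
    # Replace SAMPLE_LOG with open("ML-NEA.out").read() to parse a real run.
    for row in parse_iterations(SAMPLE_LOG):
        print(row["iteration"], row["train_number"], row["RMSE_geom"], row["rRMSE"])
```

The summary line "ML-NEA iteration ended after 4 iteration!" is skipped on purpose, because it carries no train_number or error values.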
After the calculations have finished, the spectra are plotted to the file plot.png in the cross-section sub-directory. You can open and check it with your favorite image viewer, e.g.:
gwenview cross-section/plot.png
It should look like:
Example 2
Go to the directory with example 2 (starting from the tutorial_MLinQC22-ML-ES-props directory):
cd example2
Run MLatom to calculate the spectrum with 2k training points and compare the resulting spectrum to the ML-NEA spectrum generated in example 1 (i.e. with 200 training points):
python3 ../MLatom_v2-0-5/MLatom.py ML-NEA.inp &> ML-NEA.out
Depending on your machine, it may take from a couple of minutes to dozens of minutes. The output ML-NEA.out should contain the following lines:
==========================================================================================
run ML-NEA iteratively for spectrum generation ( ML_train_iter ) started at Wed Dec 1 12:00:19 2021 CST
==========================================================================================
use all QC points to run ML-NEA calculations ( ML_train_all ) started at Wed Dec 1 12:20:25 2021 CST
RMSE_geom value for 2000 point: 0.05609438652907553
use all QC points to run ML-NEA calculations ( ML_train_all ) finished at Wed Dec 1 12:44:40 2021 CST |||| total spent 1454.98 sec
==========================================================================================
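When comparing runs, the number of training points, the RMSE_geom value, and the wall time can be extracted from the output in the same way. The sketch below assumes the ML_train_all line format shown above; summarize_train_all and SAMPLE_LOG are hypothetical names used for illustration only, not MLatom functionality.

```python
import re

# Matches: RMSE_geom value for 2000 point: 0.056...
RMSE_RE = re.compile(r"RMSE_geom value for\s+(\d+)\s+point:\s*([0-9.eE+-]+)")
# Matches: ( ML_train_all ) finished at <date> |||| total spent 1454.98 sec
TIME_RE = re.compile(
    r"\(\s*(\w+)\s*\)\s+finished at .*?total spent\s+([0-9.]+)\s+sec"
)

# Lines copied from the example output shown above.
SAMPLE_LOG = """\
use all QC points to run ML-NEA calculations ( ML_train_all ) started at Wed Dec 1 12:20:25 2021 CST
RMSE_geom value for 2000 point: 0.05609438652907553
use all QC points to run ML-NEA calculations ( ML_train_all ) finished at Wed Dec 1 12:44:40 2021 CST |||| total spent 1454.98 sec
"""


def summarize_train_all(text):
    """Collect training-set size, RMSE_geom, and time spent per finished stage."""
    summary = {"n_points": None, "RMSE_geom": None, "seconds": {}}
    rmse = RMSE_RE.search(text)
    if rmse:
        summary["n_points"] = int(rmse.group(1))
        summary["RMSE_geom"] = float(rmse.group(2))
    # Each finished stage is keyed by the name in parentheses, e.g. ML_train_all.
    for stage, seconds in TIME_RE.findall(text):
        summary["seconds"][stage] = float(seconds)
    return summary


if __name__ == "__main__":
    # Replace SAMPLE_LOG with open("ML-NEA.out").read() to parse a real run.
    result = summarize_train_all(SAMPLE_LOG)
    print("training points:", result["n_points"])
    print("RMSE_geom:", result["RMSE_geom"])
    for stage, seconds in result["seconds"].items():
        print(stage, "took", round(seconds / 60.0, 1), "min")
```

Lines that report only a start time are ignored, so a stage without a "finished at" line does not appear in the timing summary.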
After the calculations have finished, the spectra are plotted to the file plot.png in the cross-section sub-directory. You can open and check it with your favorite image viewer, e.g.:
gwenview cross-section/plot.png
It should look like:
Example 3
Go to the directory with example 3 (starting from the tutorial_MLinQC22-ML-ES-props directory):
cd example3
Run MLatom to calculate the spectrum with 100k points in the ensemble and compare the resulting spectrum to the ML-NEA spectrum generated in example 1 (i.e. with 50k points in the ensemble):
python3 ../MLatom_v2-0-5/MLatom.py ML-NEA.inp &> ML-NEA.out
Depending on your machine, it may take from a couple of minutes to dozens of minutes. The output ML-NEA.out should contain the following lines:
==========================================================================================
use all QC points to run ML-NEA calculations ( ML_train_all ) started at Wed Dec 1 13:25:24 2021 CST
RMSE_geom value for 200 point: 0.0695737045717655
use all QC points to run ML-NEA calculations ( ML_train_all ) finished at Wed Dec 1 13:33:41 2021 CST |||| total spent 496.45 sec
==========================================================================================
After the calculations have finished, the spectra are plotted to the file plot.png in the cross-section sub-directory. You can open and check it with your favorite image viewer, e.g.:
gwenview cross-section/plot.png
It should look like: