Data in MLatom’s Python API
Data is the cornerstone of MLatom, as it is a data-driven package. To truly unlock MLatom’s potential, you need to master data handling with its Python API.
In our online tutorial, we give a primer on MLatom’s data, and you will learn, for example, how to view normal-mode vibrations or MD in Jupyter with a single line.
The molecule is a central concept in MLatom, since most operations are performed on molecule class objects. These objects store information about the constituent atom objects, their coordinates, and any property we want to learn or calculate. Molecules can be loaded and dumped in various formats.
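To make the data layout concrete, here is a minimal sketch of a molecule that holds atoms, their coordinates, and a property dictionary, and that can be dumped to the plain XYZ text format. The `Atom` and `Molecule` classes, their attribute names, and `to_xyz_string` are hypothetical stand-ins written for this illustration. They are not MLatom's classes, so consult the MLatom documentation for the real interface.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for illustration only; these are NOT MLatom's classes.
@dataclass
class Atom:
    element_symbol: str
    xyz_coordinates: tuple  # (x, y, z) in Angstrom


@dataclass
class Molecule:
    atoms: list = field(default_factory=list)
    properties: dict = field(default_factory=dict)  # e.g. {"energy": ...}

    def to_xyz_string(self) -> str:
        """Serialize to plain XYZ: atom count, comment line, one line per atom."""
        lines = [str(len(self.atoms)), ""]
        for atom in self.atoms:
            x, y, z = atom.xyz_coordinates
            lines.append(f"{atom.element_symbol} {x:.6f} {y:.6f} {z:.6f}")
        return "\n".join(lines)


# A hydrogen molecule with an illustrative bond length of 0.74 Angstrom.
h2 = Molecule(
    atoms=[Atom("H", (0.0, 0.0, 0.0)), Atom("H", (0.0, 0.0, 0.74))],
    properties={"energy": None},  # placeholder; no value is computed here
)
xyz_text = h2.to_xyz_string()
```

Keeping properties in a dictionary next to the atoms means that whatever is learned or calculated later can be attached to the same object that carries the geometry.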
Machine learning requires data on many molecules, and MLatom handles it with the molecular_database class. Databases can be loaded and dumped in different formats. A useful feature of molecular databases is that they can be manipulated like lists or NumPy arrays, or split into several other databases. This helps, for example, when we need to split the data into training and test sets.
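The idea behind a training/test split can be sketched with a plain Python list standing in for a database. The `split_database` helper, its parameters, and the string "molecules" below are assumptions made for this example. MLatom's molecular_database has its own splitting functionality, which this sketch does not call.

```python
import random


def split_database(database, train_fraction=0.8, seed=0):
    """Randomly split a list-like database into (training, test) lists."""
    indices = list(range(len(database)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * len(database)))
    training = [database[i] for i in indices[:n_train]]
    test = [database[i] for i in indices[n_train:]]
    return training, test


# Stand-in "database": strings in place of molecule objects.
database = [f"molecule_{i}" for i in range(10)]
training_set, test_set = split_database(database, train_fraction=0.8, seed=0)
```

Shuffling indices rather than the database itself leaves the original ordering untouched, and each entry ends up in exactly one of the two sets.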
Molecular dynamics trajectories are handled with a dedicated molecular_trajectory class. Each trajectory step contains the usual information about the molecule, the step number, time, energies, and so on.
If you are dealing with excited states and surface-hopping trajectories, MLatom also provides intuitive access to complex properties through molecule.electronic_states.