Dataset for "Distortion/Interaction Analysis via Machine Learning"
Machine learning (ML) has previously been applied to predict reaction barriers for a variety of chemical reactions. The barrier is often treated as the end point of this type of study; however, post-reaction-barrier analysis and energy decomposition approaches can provide further insight into chemical reactivity. One such approach, used previously to study cycloaddition reactions in particular, is distortion/interaction-activation strain analysis (DIAS). We demonstrate that ML can be coupled with cheap and rapid semi-empirical quantum mechanical (SQM) methods to predict distortion and interaction energies at a fraction of the computational cost of density functional theory (DFT) calculations. This dataset includes all the structural data, in the form of Gaussian16 (Revisions A.03 and C.01) output files, for the four datasets used in this work and for the literature dataset reactions.
Cite this dataset as:
Espley, S., and Allsop, S., 2024. Dataset for "Distortion/Interaction Analysis via Machine Learning". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-01398.
Data
data_archive.zip
application/zip (4GB)
Creative Commons: Attribution 4.0
A zipped directory containing distortion/interaction calculations for four datasets: nitro-Michael addition (MA), Diels-Alder, [3+2] cycloaddition, and dimethyl malonate MA. The distortion/interaction calculations were performed at both AM1 and the DFT level of theory used for the original dataset. For the dimethyl malonate MA dataset, the reactant and transition structure geometries are also provided; these were calculated at AM1 and wB97X-D/def2-TZVP (IEFPCM=Water)//wB97X-D/def2-TZVP.
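As a rough illustration of how energies might be read from Gaussian output files such as those in the archive, the sketch below extracts the last `SCF Done` energy from output text. The function name, the regular expression, and the sample text are illustrative assumptions; nothing here reflects the archive's actual directory layout or the authors' own scripts.

```python
import re

# Matches lines such as:
#  " SCF Done:  E(RwB97XD) =  -308.123456789     A.U. after    9 cycles"
# The method label (e.g. RAM1, RwB97XD) is captured but not used here.
SCF_PATTERN = re.compile(
    r"SCF Done:\s+E\(([^)]+)\)\s+=\s+(-?\d+\.\d+(?:[DdEe][+-]?\d+)?)"
)


def last_scf_energy(log_text: str) -> float:
    """Return the final SCF energy (in Hartree) found in Gaussian output text."""
    matches = SCF_PATTERN.findall(log_text)
    if not matches:
        raise ValueError("no 'SCF Done' line found")
    value = matches[-1][1]
    # Fortran-style exponents (D) are converted so that float() accepts them.
    return float(value.replace("D", "E").replace("d", "e"))


# Hypothetical sample text, not taken from the dataset.
sample = """
 SCF Done:  E(RwB97XD) =  -308.100000000     A.U. after   14 cycles
 SCF Done:  E(RwB97XD) =  -308.123456789     A.U. after    9 cycles
"""
print(last_scf_energy(sample))
```

Taking the last match is a deliberate choice: an optimisation output reports one `SCF Done` line per step, and the final one corresponds to the last geometry evaluated.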
Contributors
David Buttar, Supervisor, AstraZeneca
Simone Tomasi, Supervisor, AstraZeneca
Matthew Grayson, Supervisor, University of Bath
University of Bath, Rights Holder
AstraZeneca, Rights Holder
Documentation
Data collection method:
Ground state reactant and transition state geometries for the dimethyl malonate Michael addition reactions were built using Schrödinger’s R-Group Enumeration. R-groups were placed at various positions on the Michael acceptor; the positions depended on the molecules in question. All structures were built in Gaussian16 (Revisions A.03 and C.01) and conformationally searched using Schrödinger’s MacroModel (version 12.7). All structures were subsequently optimised in Gaussian16 (Revisions A.03 and C.01) at AM1 and wB97X-D/def2-TZVP (IEFPCM=Water)//wB97X-D/def2-TZVP. For the distortion/interaction-activation strain calculations, Python code (available on the associated GitHub page: https://github.com/the-grayson-group/distortion-interaction_ML) was used to separate the distorted reactant structures, and single point energies were then calculated in Gaussian16 (Revision C.01) at AM1 and at the DFT level of theory used in the original transition structure calculation.
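For readers unfamiliar with the decomposition, the standard distortion/interaction relationship can be sketched as follows: the distortion energy is the cost of deforming each reactant from its relaxed geometry to its geometry in the transition structure, the interaction energy is the difference between the transition structure energy and the sum of the distorted fragment energies, and the two sum to the electronic barrier relative to the separated reactants. This is a minimal sketch, not the code from the associated GitHub repository; the function name and the example energies are hypothetical.

```python
HARTREE_TO_KCAL_MOL = 627.509474  # standard conversion factor


def dias_decomposition(e_ts, e_distorted, e_relaxed):
    """Split an electronic barrier into distortion and interaction terms.

    e_ts        : single point energy of the transition structure (Hartree)
    e_distorted : energies of each reactant fragment frozen in its
                  transition-structure geometry (Hartree)
    e_relaxed   : energies of the same reactants at their optimised
                  ground state geometries (Hartree), in the same order
    Returns a dict of energies in kcal/mol.
    """
    if len(e_distorted) != len(e_relaxed):
        raise ValueError("need one relaxed energy per distorted fragment")
    distortion = sum(d - r for d, r in zip(e_distorted, e_relaxed))
    interaction = e_ts - sum(e_distorted)
    barrier = e_ts - sum(e_relaxed)
    return {
        "distortion": distortion * HARTREE_TO_KCAL_MOL,
        "interaction": interaction * HARTREE_TO_KCAL_MOL,
        "barrier": barrier * HARTREE_TO_KCAL_MOL,
    }


# Made-up energies for a two-fragment reaction, for illustration only.
result = dias_decomposition(
    e_ts=-500.020,
    e_distorted=[-300.010, -200.000],
    e_relaxed=[-300.030, -200.015],
)
for name, value in result.items():
    print(f"{name}: {value:.2f} kcal/mol")
```

By construction the distortion and interaction terms add up to the barrier, which gives a simple consistency check when processing many reactions.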
Funders
Engineering and Physical Sciences Research Council (EPSRC)
https://doi.org/10.13039/501100000266
Industrial CASE Account - University of Bath 2020
EP/V519637/1
Engineering and Physical Sciences Research Council (EPSRC)
https://doi.org/10.13039/501100000266
Machine Learning and Molecular Modelling: A Synergistic Approach to Rapid Reactivity Prediction
EP/W003724/1
Publication details
Publication date: 21 October 2024
Published by: University of Bath
Version: 1
DOI: https://doi.org/10.15125/BATH-01398
URL for this record: https://researchdata.bath.ac.uk/id/eprint/1398
Related papers and books
Espley, S. G., Allsop, S. S., Buttar, D., Tomasi, S., and Grayson, M. N., 2024. Distortion/interaction analysis via machine learning. Digital Discovery. Available from: https://doi.org/10.1039/d4dd00224e.
Related datasets and code
Espley, S., and Farrar, E., 2023. Dataset for "Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach". Version 1. Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-01229.
Contact information
Please contact the Research Data Service in the first instance for all matters concerning this item.
Contact person: Sam Espley
Faculty of Science
Chemistry