Dataset for "Machine learning and semi-empirical calculations: A synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction"
All computed chemical structures used to build machine learning models to predict high-level chemical reaction barriers using low-level inputs.
Modern quantum mechanical modelling methods, such as Density Functional Theory (DFT), have provided detailed mechanistic insights into countless reactions and have been used in the design of a handful of chemical transformations. However, their computational cost inhibits their ability to rapidly screen large numbers of substrates and catalysts in reaction discovery. For a C-C bond forming Nitro-Michael addition, we introduce a synergistic semi-empirical quantum mechanical (SQM) and machine learning (ML) approach that achieves the fast and accurate prediction of DFT-quality free energy activation barriers using purely SQM-derived data. This dataset includes all the structural data, in the form of Gaussian16 (Revision A.03) output files, for the Nitro-Michael reaction used for this machine learning analysis.
Cite this dataset as:
Farrar, E., and Grayson, M., 2022. Dataset for "Machine learning and semi-empirical calculations: A synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-01092.
Data
structures.zip
application/zip (1GB)
Creative Commons: Attribution 4.0
Zip file containing optimised structures for each level of theory (AM1, AM1-IEFPCM, PM6, PM6-IEFPCM, UFF, wB97XD, wB97XD-IEFPCM). Each level of theory other than the IEFPCM folders includes the nucleophile (nuc), 1037 Michael acceptors (gs), and 1037 Michael addition transition states for the reaction of the nucleophile with each Michael acceptor (ts), complete with IEFPCM(toluene) single-point energy calculations. The IEFPCM folders include only the 37 literature reactions.
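As a starting point for working with the extracted output files, the sketch below shows one way a free energy barrier could be assembled from a nuc, gs, and ts calculation. It is a minimal illustration, not the authors' workflow. It assumes that the relevant Gaussian16 outputs contain a frequency job, and therefore the standard "Sum of electronic and thermal Free Energies=" line, and that the barrier is taken relative to the separated nucleophile and Michael acceptor. The function names are hypothetical, and nothing here relies on the actual folder or file naming inside structures.zip.

```python
import re
from typing import Optional

# Conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095

# Line printed by Gaussian at the end of a frequency calculation.
_FREE_ENERGY = re.compile(
    r"Sum of electronic and thermal Free Energies=\s*(-?\d+\.\d+)"
)


def last_free_energy(log_text: str) -> Optional[float]:
    """Return the last reported free energy (Hartree), or None if absent."""
    matches = _FREE_ENERGY.findall(log_text)
    return float(matches[-1]) if matches else None


def barrier_kcal_per_mol(g_ts: float, g_gs: float, g_nuc: float) -> float:
    """Free energy barrier relative to separated reactants.

    Assumption: barrier = G(ts) - G(gs) - G(nuc), with all inputs in Hartree.
    """
    return (g_ts - g_gs - g_nuc) * HARTREE_TO_KCAL_PER_MOL
```

To use it, read each output file as text (for example with pathlib.Path.read_text), pass the text to last_free_energy, and check for None before calling barrier_kcal_per_mol, since an output without a frequency job will not contain the free energy line.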
Contributors
University of Bath (Rights Holder)
Documentation
Data collection method:
Uncatalysed reactant and transition state geometries for 1000 Nitro-Michael addition reactions were built using Schrödinger’s R-Group Enumeration by varying four positions of a generic Michael acceptor core with common organic fragments. The nucleophile, together with uncatalysed reactant and transition state geometries for a further 37 biologically important Nitro-Michael addition reactions from the literature, was built in Gaussian16 (Revision A.03). All reactant and transition state structures were conformationally searched using Schrödinger’s MacroModel (version 12.7) and subsequently optimised in Gaussian16 (Revision A.03) with several different molecular modelling methods.
Funders
Engineering and Physical Sciences Research Council
https://doi.org/10.13039/501100000266
DTP 2016-2017 University of Bath
EP/N509589/1
Publication details
Publication date: 14 June 2022
Published by: University of Bath
Version: 1
DOI: https://doi.org/10.15125/BATH-01092
URL for this record: https://researchdata.bath.ac.uk/id/eprint/1092
Related papers and books
Farrar, E. H. E., and Grayson, M. N., 2022. Machine learning and semi-empirical calculations: a synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction. Chemical Science, 13(25), 7594-7603. Available from: https://doi.org/10.1039/d2sc02925a.
Contact information
Please contact the Research Data Service in the first instance for all matters concerning this item.
Contact person: Elliot H E Farrar
Faculty of Science
Chemistry