Dataset for "Computational Modelling and Machine Learning Approaches Towards Understanding Asymmetric Catalytic Organic Reactions"

This dataset contains all computed chemical structures used to model several chemical reactions and to build machine learning models that predict high-level chemical reaction barriers from low-level inputs.

For several decades, chemical modelling methods such as density functional theory (DFT) have provided invaluable contributions to the understanding of asymmetric and catalytic reactions. However, machine learning (ML) models, once trained, could allow for much more rapid screening of chemical reactions. In the thesis associated with this dataset, research into two distinct approaches to understanding organic reactions, modelling and ML, is presented, including several examples of conventional modelling with DFT, as well as details of a new ML methodology that bridges the gap between semi-empirical quantum mechanical (SQM) methods and DFT. This repository contains Gaussian09 and Gaussian16 output files for all computed structures used in this work, as well as a document containing a complete list of all metrics, features, and hyperparameters for all computed ML models.
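As a rough illustration of the quantities involved, a reaction barrier can be computed from the energies of a transition state and its reactant, and the gap between a low-level (SQM) and a high-level (DFT) barrier is one possible target for a learned correction. The sketch below is an assumption made for illustration only and is not necessarily the methodology used in the thesis; the function names are hypothetical.

```python
# Illustrative sketch only: the function names are hypothetical and this is
# not claimed to be the methodology used in the associated thesis.

HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard energy conversion factor


def barrier_kcal_per_mol(e_reactant_hartree: float, e_ts_hartree: float) -> float:
    """Barrier = E(transition state) - E(reactant), converted to kcal/mol."""
    return (e_ts_hartree - e_reactant_hartree) * HARTREE_TO_KCAL_PER_MOL


def barrier_correction(sqm_barrier: float, dft_barrier: float) -> float:
    """Difference a model would need to add to an SQM barrier to reach the DFT one."""
    return dft_barrier - sqm_barrier
```

Framing the target as a correction rather than the absolute DFT barrier is one common design choice, since the low-level method may already capture much of the underlying trend.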

Keywords:
Computational chemistry, Asymmetric organocatalysis, Computational modelling, Density functional theory, Machine learning
Subjects:
Chemical reaction dynamics and mechanisms
Chemical synthesis

Cite this dataset as:
Farrar, E., 2022. Dataset for "Computational Modelling and Machine Learning Approaches Towards Understanding Asymmetric Catalytic Organic Reactions". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-01148.

Data

data.zip
application/zip (1GB)
Creative Commons: Attribution 4.0

Zip file containing Gaussian09 and Gaussian16 output files for five distinct projects included in the associated thesis, as well as a document with all machine learning model metrics, features, and hyperparameters. A README.txt file is also included that details the structure of the folders and files provided.
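For readers who want a quick overview of the archive, the following minimal sketch scans a zip file for Gaussian output files and reports the last SCF energy and whether each job terminated normally. It assumes the outputs use the .log or .out extension and are readable as text; the README.txt in the archive is the authoritative description of the folder layout. The function names are hypothetical.

```python
import re
import zipfile

# Matches lines such as " SCF Done:  E(RB3LYP) =  -76.4089533236     A.U. after ..."
# and also energies printed with an exponent (e.g. -0.874533939640E-01).
SCF_DONE = re.compile(r"SCF Done:\s+E\((\S+)\)\s+=\s+(-?\d+\.\d+(?:[DE][+-]?\d+)?)")


def summarise_gaussian_output(text):
    """Return the last SCF energy (Hartree), its method label, and termination status."""
    matches = SCF_DONE.findall(text)
    if matches:
        method, energy_str = matches[-1]  # the last match is the final geometry
        energy = float(energy_str.replace("D", "E"))
    else:
        method, energy = None, None
    return {
        "method": method,
        "final_scf_energy_hartree": energy,
        "normal_termination": "Normal termination of Gaussian" in text,
    }


def summarise_archive(zip_path, suffixes=(".log", ".out")):
    """Summarise every member of the zip whose name ends in one of `suffixes`.

    The suffixes are an assumption about how the output files are named.
    """
    results = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(suffixes):
                text = archive.read(name).decode("utf-8", errors="replace")
                results[name] = summarise_gaussian_output(text)
    return results
```

Calling summarise_archive("data.zip") after downloading the file would return a dictionary keyed by member path; outputs without a normal-termination line can then be inspected by hand.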

Creators

Elliot Farrar
University of Bath

Contributors

Matthew Grayson
Supervisor
University of Bath

University of Bath
Rights Holder

Coverage

Collection date(s):

From 1 September 2018 to 26 May 2022

Documentation

Data collection method:

Structures were generated by conformational searching with Schrödinger's MacroModel (Versions 11.3, 11.6 and 12.7) and enumeration with Schrödinger's R-Group Enumeration. These structures were optimised with Gaussian09 (Revision D.01) (performed at the University of Cambridge) and Gaussian16 (Revision A.03) (performed at the University of Bath) using several different molecular modelling methods. Additional analyses were performed with NCIPLOT. Machine learning models were built using the scikit-learn and mlxtend Python packages.

Documentation Files

README.txt
text/plain (1kB)
Creative Commons: Attribution 4.0

Documentation of folder structure

Funders

Engineering and Physical Sciences Research Council
https://doi.org/10.13039/501100000266

DTP 2018-19 University of Bath
EP/R513155/1

Publication details

Publication date: 12 August 2022
by: University of Bath

Version: 1

DOI: https://doi.org/10.15125/BATH-01148

URL for this record: https://researchdata.bath.ac.uk/id/eprint/1148

Related papers and books

Momo, P. B., Leveille, A. N., Farrar, E. H. E., Grayson, M. N., Mattson, A. E., and Burtoloso, A. C. B., 2020. Enantioselective S−H Insertion Reactions of α‐Carbonyl Sulfoxonium Ylides. Angewandte Chemie International Edition, 59(36), 15554-15559. Available from: https://doi.org/10.1002/anie.202005563.

Lerchen, A., Gandhamsetty, N., Farrar, E. H. E., Winter, N., Platzek, J., Grayson, M. N., and Aggarwal, V. K., 2020. Enantioselective Total Synthesis of (−)‐Finerenone Using Asymmetric Transfer Hydrogenation. Angewandte Chemie International Edition, 59(51), 23107-23111. Available from: https://doi.org/10.1002/anie.202011256.

Farrar, E. H. E., and Grayson, M. N., 2020. Computational Studies of Chiral Hydroxyl Carboxylic Acids: The Allylboration of Aldehydes. The Journal of Organic Chemistry, 85(23), 15449-15456. Available from: https://doi.org/10.1021/acs.joc.0c02226.

Related theses

Farrar, E., 2022. Computational Modelling and Machine Learning Approaches Towards Understanding Asymmetric Catalytic Organic Reactions. Thesis (PhD). University of Bath. Available from: https://researchportal.bath.ac.uk/en/studentTheses/computational-modelling-and-machine-learning-approaches-towards-u.

Contact information

Please contact the Research Data Service in the first instance for all matters concerning this item.

Contact person: Elliot Farrar

Departments:

Faculty of Science
Chemistry