Dataset for "Reformulating Reactivity Design for Data-Efficient Machine Learning"

This dataset contains the Gaussian 16 output files for the dataset of aza-Michael addition reactions used in the publication "Fast Identification of Reactions with Desired Barriers by Reformulating Machine Learning Activation Energies". The structures of the methylamine nucleophile, the 1000 Michael acceptor electrophiles and their 1000 transition states were all optimised at the wB97X-D/def2-TZVP level of theory with the IEFPCM(water) implicit solvent model. Before optimisation all Michael acceptors and transition states were conformationally searched using the MMFF force field in Schrödinger's MacroModel software and the lowest energy conformer was selected for DFT calculation. This dataset also contains the Gaussian 16 output files for the SVWN/def2-SVP single-point energy calculations on the dihydrogen activation catalyst and transition state structures.

Keywords:
Machine learning, Activation barriers, Guided barrier search
Subjects:
Chemical synthesis

Cite this dataset as:
Lewis-Atwell, T., Beechey, D., Şimşek, Ö., Grayson, M., 2023. Dataset for "Reformulating Reactivity Design for Data-Efficient Machine Learning". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-01240.

Data

electrophiles.zip
application/zip (104MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the 1000 Michael acceptor reactants, optimised at the wB97X-D/def2-TZVP level of theory with the IEFPCM(water) implicit solvent model.

transitionstates.zip
application/zip (161MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the 1000 aza-Michael addition reaction transition states, optimised at the wB97X-D/def2-TZVP level of theory with the IEFPCM(water) implicit solvent model.

methylamine.out
text/plain (115kB)
Creative Commons: Attribution 4.0

Gaussian 16 output file for the methylamine nucleophile, optimised at the wB97X-D/def2-TZVP level of theory with the IEFPCM(water) implicit solvent model.

electrophiles_pm6.zip
application/zip (8MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the 1000 Michael acceptor reactants, single-point energies using the PM6 semi-empirical method with the IEFPCM(water) implicit solvent model.

transitionstates_pm6.zip
application/zip (9MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the 1000 aza-Michael addition reaction transition states, single-point energies using the PM6 semi-empirical method with the IEFPCM(water) implicit solvent model.

methylamine_pm6.out
text/plain (104kB)
Creative Commons: Attribution 4.0

Gaussian 16 output file for the methylamine nucleophile, optimised using the PM6 semi-empirical method with the IEFPCM(water) implicit solvent model.

catalysts_lda.zip
application/zip (28MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the Vaska's complex iridium catalysts, single-point energies at the SVWN/def2-SVP level of theory.

dihydrogen_lda.zip
application/zip (28MB)
Creative Commons: Attribution 4.0

Gaussian 16 output files for the dihydrogen activation catalyst transition states, single-point energies at the SVWN/def2-SVP level of theory.

h2.out
text/plain (71kB)
Creative Commons: Attribution 4.0

Gaussian 16 output file for the dihydrogen molecule, optimised at the SVWN/def2-SVP level of theory.

GitHub repository for the code used in the experiments described in the publication.

Creators

Daniel Beechey
University of Bath

Contributors

University of Bath
Rights Holder

Documentation

Data collection method:

1000 Michael acceptor structures and the transition states for their reactions with methylamine were generated according to the scheme shown in the image "michael_structures.png" using the “R-Group Creator” and “Custom R-Group Enumeration” tools from Schrödinger's Maestro. The resulting Michael acceptors and transition states were conformationally searched using Schrödinger's MacroModel with the MMFF force field, and the lowest energy electrophile and transition state conformers were selected for DFT optimisation. Gaussian 16 was used to perform geometry optimisation of the selected conformers as well as the methylamine nucleophile at the wB97X-D/def2-TZVP level of theory with the IEFPCM(water) solvent model. Gaussian 16 was also used to perform single-point energy calculations on the Michael acceptor and transition state structures using the PM6 semi-empirical method with the IEFPCM(water) solvent model. Finally, Gaussian 16 was used to perform single-point energy calculations at the SVWN/def2-SVP level of theory on all of the transition state and catalyst structures available from the "Vaska's space" dataset (https://doi.org/10.5683/SP2/CJS7QA).
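The energies produced by the calculations described above can be read directly from the Gaussian output text. The following is a minimal Python sketch, not the authors' own workflow: it takes the last "SCF Done" energy in an output file and forms an electronic-energy barrier as E(transition state) minus the sum of E(electrophile) and E(methylamine). The function names are hypothetical, and whether the publication uses electronic energies or another quantity (for example free energies) is not stated in this record, so treat that choice as an assumption.

```python
import re

# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095

# Matches Gaussian lines such as
# " SCF Done:  E(RwB97XD) =  -95.123456789     A.U. after   10 cycles"
_SCF_DONE = re.compile(
    r"SCF Done:\s+E\(\S+\)\s*=\s*(-?\d+\.\d+(?:[DE][+-]?\d+)?)"
)


def last_scf_energy(output_text: str) -> float:
    """Return the final 'SCF Done' energy (Hartree) found in the text.

    For an optimisation the last occurrence corresponds to the final
    geometry; for a single-point job there is usually only one.
    """
    matches = _SCF_DONE.findall(output_text)
    if not matches:
        raise ValueError("no 'SCF Done' line found in output text")
    # Gaussian occasionally prints Fortran-style 'D' exponents.
    return float(matches[-1].replace("D", "E"))


def electronic_barrier_kcal_mol(
    e_ts: float, e_electrophile: float, e_nucleophile: float
) -> float:
    """Hypothetical helper: barrier from electronic energies in Hartree.

    Assumes the barrier is defined relative to the separated reactants.
    """
    return (e_ts - (e_electrophile + e_nucleophile)) * HARTREE_TO_KCAL_PER_MOL
```

To use it, pass the text of a transition state output, the matching electrophile output and "methylamine.out" through `last_scf_energy`, then combine the three values with `electronic_barrier_kcal_mol`.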

Technical details and requirements:

“R-Group Creator” and “Custom R-Group Enumeration” tools from Schrödinger Maestro v12.5. “Conformational Search” tool from Schrödinger MacroModel v12.9. Gaussian 16, Revision A.03 and Revision C.01.

Additional information:

The "electrophiles.zip" file contains the Gaussian output files for the optimised Michael acceptor structures. The "transitionstates.zip" file contains the Gaussian output files for the optimised aza-Michael addition transition state structures. The "methylamine.out" file is the Gaussian output file for the optimised methylamine nucleophile structure. The "electrophiles_pm6.zip" file contains the Gaussian output files for the PM6 single-point energies for the Michael acceptors. The "transitionstates_pm6.zip" file contains the Gaussian output files for the PM6 single-point energies for the aza-Michael addition transition states. The "methylamine_pm6.out" file is the Gaussian output file for the PM6-optimised methylamine nucleophile structure. The "catalysts_lda.zip" file contains the Gaussian output files for the single-point LDA iridium catalyst energies. The "dihydrogen_lda.zip" file contains the Gaussian output files for the single-point LDA dihydrogen activation transition state energies. The "h2.out" file is the Gaussian output file for the LDA-optimised dihydrogen molecule.
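The archives listed above can be read without unpacking them to disk. The sketch below uses only the Python standard library; the helper names are hypothetical, and because the internal layout and file names of the archives are not described in this record, the code simply iterates over every non-directory member. The normal-termination check relies on the "Normal termination of Gaussian" message that Gaussian writes as the last line of a successfully completed job.

```python
import zipfile


def gaussian_outputs(archive):
    """Yield (member_name, text) for every file in a zip archive.

    `archive` may be a path (e.g. "electrophiles.zip") or a binary
    file-like object. Member names inside the real archives are not
    documented here, so no naming scheme is assumed.
    """
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            text = zf.read(info).decode("utf-8", errors="replace")
            yield info.filename, text


def terminated_normally(output_text: str) -> bool:
    """Return True if the last non-empty line reports normal termination."""
    lines = output_text.rstrip().splitlines()
    return bool(lines) and "Normal termination of Gaussian" in lines[-1]
```

A typical use would be to loop over `gaussian_outputs("transitionstates.zip")` and collect the names of any members for which `terminated_normally` returns False before extracting energies from the rest.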

Documentation Files

michael_structures.png
image/png (35kB)
Creative Commons: Attribution 4.0

Scheme used to generate the Michael acceptor structures. The functional groups listed next to "R1-R3" and "R4" are those allowed at each position.

Funders

UK Research and Innovation
https://doi.org/10.13039/100014013

UKRI Centre for Doctoral Training in Accountable, Responsible and Transparent AI
EP/S023437/1

Engineering and Physical Sciences Research Council
https://doi.org/10.13039/501100000266

Machine Learning and Molecular Modelling: A Synergistic Approach to Rapid Reactivity Prediction
EP/W003724/1

Publication details

Publication date: 6 October 2023
by: University of Bath

Version: 1

DOI: https://doi.org/10.15125/BATH-01240

URL for this record: https://researchdata.bath.ac.uk/id/eprint/1240

Related papers and books

Lewis-Atwell, T., Beechey, D., Şimşek, Ö., and Grayson, M. N., 2023. Reformulating Reactivity Design for Data-Efficient Machine Learning. ACS Catalysis, 13(20), 13506-13515. Available from: https://doi.org/10.1021/acscatal.3c02513.

Contact information

Please contact the Research Data Service in the first instance for all matters concerning this item.

Contact person: Matthew Grayson

Departments:

Faculty of Science
Chemistry
Computer Science

Research Centres & Institutes
UKRI CDT in Accountable, Responsible and Transparent AI