journal article Sep 01, 2023

Determining best practices for using genetic algorithms in molecular discovery

DOI: 10.1063/5.0158053
Abstract
Genetic algorithms (GAs) are a powerful tool to search large chemical spaces for inverse molecular design. However, GAs have multiple hyperparameters that have not been thoroughly investigated for chemical space searches. In this tutorial, we examine the general effects of a number of hyperparameters, such as population size, elitism rate, selection method, mutation rate, and convergence criteria, on key GA performance metrics. We show that using a self-termination method with a minimum Spearman’s rank correlation coefficient of 0.8 between generations maintained for 50 consecutive generations along with a population size of 32, a 50% elitism rate, three-way tournament selection, and a 40% mutation rate provides the best balance of finding the overall champion, maintaining good coverage of elite targets, and improving relative speedup for general use in molecular design GAs.
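The recommended settings in the abstract can be written down as a small configuration plus a stopping rule. The following is a minimal sketch, not the authors' implementation: the abstract does not say which quantities are rank-correlated between generations, so `spearman` takes any two equal-length score vectors and the caller decides what to compare. All function and class names here are hypothetical, and fitness is assumed to be maximized.

```python
import random
from dataclasses import dataclass

# Hyperparameter values quoted in the abstract.
POPULATION_SIZE = 32
ELITISM_RATE = 0.5      # fraction of the population carried over unchanged
TOURNAMENT_SIZE = 3     # three-way tournament selection
MUTATION_RATE = 0.4
MIN_SPEARMAN = 0.8      # minimum rank correlation between generations
PATIENCE = 50           # consecutive generations the minimum must hold


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat a constant ranking as uncorrelated
    return cov / (vx * vy) ** 0.5


@dataclass
class ConvergenceTracker:
    """Self-termination: stop once rho >= min_rho for `patience` generations in a row."""
    min_rho: float = MIN_SPEARMAN
    patience: int = PATIENCE
    streak: int = 0

    def update(self, rho):
        """Record one generation's correlation; return True when converged."""
        self.streak = self.streak + 1 if rho >= self.min_rho else 0
        return self.streak >= self.patience


def tournament_select(population, fitness, rng, k=TOURNAMENT_SIZE):
    """Pick k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


def n_elites(pop_size=POPULATION_SIZE, rate=ELITISM_RATE):
    """Number of top individuals copied directly into the next generation."""
    return int(pop_size * rate)
```

In a GA loop, one would call `tracker.update(spearman(previous_scores, current_scores))` once per generation and stop when it returns `True`. A single generation below the threshold resets the streak, which is what "maintained for 50 consecutive generations" implies.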
Metrics
Citations: 21
References: 58
Details
Published: Sep 01, 2023
Vol/Issue: 159(9)
Funding: Basic Energy Sciences Award: DE-SC0019335
Cite This Article
Brianna L. Greenstein, Danielle C. Elsey, Geoffrey R. Hutchison (2023). Determining best practices for using genetic algorithms in molecular discovery. The Journal of Chemical Physics, 159(9). https://doi.org/10.1063/5.0158053