Wong, F. et al. Leveraging artificial intelligence in the fight against infectious diseases. Science 381, 164–170 (2023).
Wan, F., Wong, F., Collins, J. J. & de la Fuente-Nunez, C. Machine learning for antimicrobial peptide identification and design. Nat. Rev. Bioeng. 2, 392–407 (2024).
Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2024).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Bengio, Y., Lodi, A. & Prouvost, A. Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur. J. Oper. Res. 290, 405–421 (2021).
Bohacek, R. S., McMartin, C. & Guida, W. C. The art and practice of structure-based drug design: a molecular modeling perspective. Med. Res. Rev. 16, 3–50 (1996).
Yang, X., Wang, Y., Byrne, R., Schneider, G. & Yang, S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119, 10520–10594 (2019).
Burbidge, R., Trotter, M., Buxton, B. & Holden, S. Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput. Chem. 26, 5–14 (2001).
Warmuth, M. K. et al. Active learning with support vector machines in the drug discovery process. J. Chem. Inf. Comput. Sci. 43, 667–673 (2003).
Zernov, V. V., Balakin, K. V., Ivaschenko, A. A., Savchuk, N. P. & Pletnev, I. V. Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions. J. Chem. Inf. Comput. Sci. 43, 2048–2056 (2003).
Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).
Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).
Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702 (2020).
Liu, G. et al. Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii. Nat. Chem. Biol. 19, 1342–1350 (2023).
Zheng, E. J. et al. Discovery of antibiotics that selectively kill metabolically dormant bacteria. Cell Chem. Biol. 31, 712–728.e9 (2024).
Melo, M. C. R., Maasch, J. R. M. A. & de la Fuente-Nunez, C. Accelerating antibiotic discovery through artificial intelligence. Commun. Biol. 4, 1050 (2021).
Cesaro, A., Bagheri, M., Torres, M., Wan, F. & de la Fuente-Nunez, C. Deep learning tools to accelerate antibiotic discovery. Expert Opin. Drug Discov. 18, 1245–1257 (2023).
Krishnan, S. R. et al. De novo design of anti-tuberculosis agents using a structure-based deep learning method. J. Mol. Graph. Model. 118, 108361 (2023).
Wong, F. et al. Discovering small-molecule senolytics with deep neural networks. Nat. Aging 3, 734–750 (2023).
Jin, W. et al. Deep learning identifies synergistic drug combinations for treating COVID-19. Proc. Natl Acad. Sci. USA 118, e2105070118 (2021).
Preuer, K. et al. DeepSynergy: predicting anti-cancer drug synergy with deep learning. Bioinformatics 34, 1538–1546 (2018).
Wan, F., Kontogiorgos-Heintz, D. & de la Fuente-Nunez, C. Deep generative models for peptide design. Digit. Discov. 1, 195–208 (2022).
De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models (2018).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Jin, W., Barzilay, R. & Jaakkola, T. In Proc. 35th International Conference on Machine Learning 2323–2332 (2018).
Blaschke, T. et al. REINVENT 2.0: an AI tool for de novo drug design. J. Chem. Inf. Model. 60, 5918–5922 (2020).
Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019).
Zeng, X. et al. Deep generative molecular design reshapes drug discovery. Cell Rep. Med. 3, 100794 (2022).
Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).
Ying, R., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32, 9240–9251 (2019).
Jiménez-Luna, J., Grisoni, F. & Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2, 573–584 (2020).
Yuan, H., Yu, H., Gui, S. & Ji, S. Explainability in graph neural networks: a taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 5782–5799 (2023).
Yuan, H., Yu, H., Wang, J., Li, K. & Ji, S. In Proc. 38th International Conference on Machine Learning 12241–12252 (2021).
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).
Gilmer, J. et al. In Proc. 34th International Conference on Machine Learning 1263–1272 (2017).
Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 4–24 (2021).
Zhou, J. et al. Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020).
Reiser, P. et al. Graph neural networks for materials science and chemistry. Commun. Mater. 3, 93 (2022).
Heid, E. & Green, W. H. Machine learning of reaction properties via learned representations of the condensed graph of reaction. J. Chem. Inf. Model. 62, 2101–2110 (2022).
Jin, W., Barzilay, R. & Jaakkola, T. In Proc. 37th International Conference on Machine Learning 4849–4859 (2020).
Heid, E. et al. Chemprop: a machine learning package for chemical property prediction. J. Chem. Inf. Model. 64, 9–17 (2024).
Coulom, R. Efficient selectivity and backup operators in Monte-Carlo Tree Search. In Computers and Games (CG 2006). Lecture Notes in Computer Science (eds van den Herik, H. J. et al.) 4630, 72–83 (Springer, 2007).
Tingle, B. I. et al. ZINC-22—a free multi-billion-scale database of tangible compounds for ligand discovery. J. Chem. Inf. Model. 63, 1166–1176 (2023).
Verheij, H. J. Leadlikeness and structural diversity of synthetic screening libraries. Mol. Divers. 10, 377–388 (2006).
Krier, M., Bret, G. & Rognan, D. Assessing the scaffold diversity of screening libraries. J. Chem. Inf. Model. 46, 512–524 (2006).
Swanson, K. et al. ADMET-AI: a machine learning ADMET platform for evaluation of large-scale chemical libraries. Bioinformatics 40, btae416 (2024).
McGill, C., Forsuelo, M., Guan, Y. & Green, W. H. Predicting infrared spectra with message passing neural networks. J. Chem. Inf. Model. 61, 2594–2609 (2021).
Swinney, D. C. & Anthony, J. How were new medicines discovered? Nat. Rev. Drug Discov. 10, 507–519 (2011).
Swinney, D. C. Phenotypic vs. target-based drug discovery for first-in-class medicines. Clin. Pharmacol. Ther. 93, 299–301 (2013).
Moffat, J. G., Vincent, F., Lee, J. A., Eder, J. & Prunotto, M. Opportunities and challenges in phenotypic drug discovery: an industry perspective. Nat. Rev. Drug Discov. 16, 531–543 (2017).
Muratov, E. N. et al. QSAR without borders. Chem. Soc. Rev. 49, 3525–3564 (2020).
Wong, F. et al. Benchmarking AlphaFold‐enabled molecular docking predictions for antibiotic discovery. Mol. Syst. Biol. 18, e11081 (2022).
Bender, B. J. et al. A practical guide to large-scale docking. Nat. Protoc. 16, 4799–4832 (2021).
Loyola-González, O. Black-box vs. white-box: understanding their advantages and weaknesses from a practical point of view. IEEE Access 7, 154096–154113 (2019).
Clinical and Laboratory Standards Institute. M100: Performance Standards for Antimicrobial Susceptibility Testing (2021).
Zhang, J. H., Chung, T. D. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67–73 (1999).
Kim, S. et al. PubChem substance and compound databases. Nucleic Acids Res. 44, D1202–D1213 (2016).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
Degtyarenko, K. et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344–D350 (2008).
Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672 (2006).
Williams, A. J. et al. The CompTox Chemistry Dashboard: a community data resource for environmental chemistry. J. Cheminform. 9, 61 (2017).
Bergerhoff, G., Hundt, R., Sievers, R. & Brown, I. D. The inorganic crystal structure data base. J. Chem. Inf. Comput. Sci. 23, 66–69 (1983).
Belsky, A., Hellenbrandt, M., Karen, V. L. & Luksch, P. New developments in the Inorganic Crystal Structure Database (ICSD): accessibility in support of materials research and design. Acta Cryst. 58, 364–369 (2002).
Kononova, O. et al. Text-mined dataset of inorganic materials synthesis recipes. Sci. Data 6, 203 (2019).
Shivanyuk, A., Ryabukhin, S. V., Bogolubsky, A. V. & Tolmachev, A. Enamine REAL database: making chemical diversity real. Chem. Today 25, 58–59 (2007).
Coley, C. W., Green, W. H. & Jensen, K. F. Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 51, 1281–1289 (2018).
Fink, T., Bruggesser, H. & Reymond, J.-L. Virtual exploration of the small-molecule chemical universe below 160 Daltons. Angew. Chem. Int. Ed. 44, 1504–1508 (2005).
Fink, T. & Reymond, J.-L. Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J. Chem. Inf. Model. 47, 342–353 (2007).
Blum, L. C. & Reymond, J.-L. 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 131, 8732–8733 (2009).
Ruddigkeit, L., van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).
Baell, J. B. & Holloway, G. A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719–2740 (2010).
Brenk, R. et al. Lessons learnt from assembling screening libraries for drug discovery for neglected diseases. ChemMedChem 3, 435–444 (2008).
Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).
Wong, F. et al. Supporting code for: discovery of a structural class of antibiotics with explainable deep learning. Zenodo (2023).
Samuel, A. L. Some studies in machine learning using the game of checkers. IBM J. 3, 211–229 (1959).
Rosenblatt, F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. In Advances in Neural Information Processing Systems 1106–1114 (2012).
Vaswani, A. et al. In Advances in Neural Information Processing Systems (2017).
Trinh, T. H., Wu, Y., Le, Q. V., He, H. & Luong, T. Solving olympiad geometry without human demonstrations. Nature 625, 476–482 (2024).
Lundberg, S. M. & Lee, S.-I. In Proc. 31st International Conference on Neural Information Processing Systems 4768–4777 (2017).
Ribeiro, M. T., Singh, S. & Guestrin, C. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (2016).
Dai, H., Dai, B. & Song, L. In Proc. 33rd International Conference on Machine Learning 2702–2711 (2016).
Buterez, D., Janet, J. P., Kiddle, S. J., Oglic, D. & Lió, P. Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting. Nat. Commun. 15, 1517 (2024).
Xie, T., France-Lanord, A., Wang, Y., Shao-Horn, Y. & Grossman, J. C. Graph dynamical networks for unsupervised learning of atomic scale dynamics in materials. Nat. Commun. 10, 2667 (2019).
Yun, S., Jeong, M., Kim, R., Kang, J. & Kim, H. J. In 33rd Conference on Neural Information Processing Systems 11983–11993 (2019).