An explainable deep learning platform for molecular discovery

An explainable deep learning platform for molecular discovery

  • Wong, F. et al. Leveraging artificial intelligence in the fight against infectious diseases. Science 381, 164–170 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Wan, F., Wong, F., Collins, J. J. & de la Fuente-Nunez, C. Machine learning for antimicrobial peptide identification and design. Nat. Rev. Bioeng. 2, 392–407 (2024).

    Article 

    Google Scholar 

  • Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2024).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Bengio, Y., Lodi, A. & Prouvost, A. Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur. J. Oper. Res. 290, 405–421 (2021).

    Article 

    Google Scholar 

  • Bohacek, R. S., McMartin, C. & Guida, W. C. The art and practice of structure-based drug design: a molecular modeling perspective. Med. Res. Rev. 16, 3–50 (1996).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Yang, X., Wang, Y., Byrne, R., Schneider, G. & Yang, S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119, 10520–10594 (2019).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Burbidge, R., Trotter, M., Buxton, B. & Holden, S. Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput. Chem. 26, 5–14 (2001).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Warmuth, M. K. et al. Active learning with support vector machines in the drug discovery process. J. Chem. Inf. Comput. Sci. 43, 667–673 (2003).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Zernov, V. V., Balakin, K. V., Ivaschenko, A. A., Savchuk, N. P. & Pletnev, I. V. Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions. J. Chem. Inf. Comput. Sci. 43, 2048–2056 (2003).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Liu, G. et al. Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii. Nat. Chem. Biol. 19, 1342–1350 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Zheng, E. J. et al. Discovery of antibiotics that selectively kill metabolically dormant bacteria. Cell. Chem. Biol. 31, 712–728.e9 (2024).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Melo, M. C. R., Maasch, J. R. M. A. & de la Fuente-Nunez, C. Accelerating antibiotic discovery through artificial intelligence. Commun. Biol. 4, 1050 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Cesaro, A., Bagheri, M., Torres, M., Wan, F. & de la Fuente-Nunez, C. Deep learning tools to accelerate antibiotic discovery. Expert Opin. Drug Discov. 18, 1245–1257 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Krishnan, S. R. et al. De novo design of anti-tuberculosis agents using a structure-based deep learning method. J. Mol. Graph. Model. 118, 108361 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Wong, F. et al. Discovering small-molecule senolytics with deep neural networks. Nat. Aging 3, 734–750 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Jin, W. et al. Deep learning identifies synergistic drug combinations for treating COVID-19. Proc. Natl Acad. Sci. USA 118, e2105070118 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Preuer, K. et al. DeepSynergy: predicting anti-cancer drug synergy with deep learning. Bioinformatics 34, 1538–1546 (2018).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Wan, F., Kontogiorgos-Heintz, D. & de la Fuente-Nunez, C. Deep generative models for peptide design. Digit. Discov. 1, 195–208 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models (2018).

  • Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Jin, W., Barzilay, R. & Jaakkola, T. In Proc. 35th International Conference on Machine Learning 2323–2332 (2018).

  • Blaschke, T. et al. REINVENT 2.0: an AI tool for de novo drug design. J. Chem. Inf. Model. 60, 5918–5922 (2020).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Zeng, X. et al. Deep generative molecular design reshapes drug discovery. Cell Rep. Med. 3, 100794 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Ying, R., Bourgeois, D., You, J., Zitnik, M. & Leskovic, J. GNNExplainer: generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32, 9240–9251 (2019).

    PubMed 
    PubMed Central 

    Google Scholar 

  • Jiménez-Luna, J., Grisoni, F. & Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2, 573–584 (2020).

    Article 

    Google Scholar 

  • Yuan, H., Yu, H., Gui, S. & Ji, S. Explainability in graph neural networks: a taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 5782–5799 (2023).

    PubMed 

    Google Scholar 

  • Yuan, H., Yu., H., Wang, J., Li, K. & Ji, S. In Proc. 38th International Conference on Machine Learning 12241–12252 (2021).

  • Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Gilmer, J. et al. In Proc. 34th International Conference on Machine Learning 1263–1272 (2017).

  • Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 2162–2388 (2021).

    Article 

    Google Scholar 

  • Zhou, J. et al. Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020).

    Article 

    Google Scholar 

  • Reiser, P. et al. Graph neural networks for materials science and chemistry. Commun. Mater. 3, 93 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Heid, E. & Green, W. H. Machine learning of reaction properties via learned representations of the condensed graph of reaction. J. Chem. Inf. Model. 62, 2101–2110 (2022).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Jin, W., Barzilay, R. & Jaakkola, T. In Proc. 37th International Conference on Machine Learning 4849–4859 (2020).

  • Heid, E. et al. Chemprop: a machine learning package for chemical property prediction. J. Chem. Inf. Model. 64, 9–17 (2024).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo Tree Search. In Computers and Games (CG 2006). Lecture Notes in Computer Science (eds van den Herik, H. J. et al.) 4630, 72–83 (Springer, 2007)

  • Tingle, B. I. et al. ZINC-22—a free multi-billion-scale database of tangible compounds for ligand discovery. J. Chem. Inf. Model. 63, 1166–1176 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Verheij, H. J. Leadlikeness and structural diversity of synthetic screening libraries. Mol. Divers. 10, 377–388 (2006).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Krier, M., Bret, G. & Rognan, D. Assessing the scaffold diversity of screening libraries. J. Chem. Inf. Model. 46, 512–524 (2006).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Swanson, K. et al. ADMET-AI: a machine learning ADMET platform for evaluation of large-scale chemical libraries. Bioinformatics 40, btae416 (2024).

  • McGill, C., Forsuelo, M., Guan, Y. & Green, W. H. Predicting infrared spectra with message passing neural networks. J. Chem. Inf. Model. 61, 2594–2609 (2021).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Swinney, D. C. & Anthony, J. How were new medicines discovered. Nat. Rev. Drug Discov. 10, 507–519 (2011).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Swinney, D. C. Phenotypic vs. target-based drug discovery for first-in-class medicines. Clin. Pharmacol. Ther. 93, 299–301 (2013).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Moffat, J. G., Vincent, F., Lee, J. A., Eder, J. & Prunotto, M. Opportunities and challenges in phenotypic drug discovery: an industry perspective. Nat. Rev. Drug Discov. 16, 531–543 (2017).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Muratov, E. N. et al. QSAR without borders. Chem. Soc. Rev. 49, 3525–3564 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Wong, F. et al. Benchmarking AlphaFold‐enabled molecular docking predictions for antibiotic discovery. Mol. Syst. Biol. 18, e11081 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Bender, B. J. et al. A practical guide to large-scale docking. Nat. Protoc. 16, 4799–4832 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Loyola-González, O. Black-box vs. white-box: understanding their advantages and weaknesses from a practical point of view. IEEE Access 7, 154096–154113 (2019).

  • Clinical and Laboratory Standards Institute. M100: Performance Standards for Antimicrobial Susceptibility Testing (2021).

  • Zhang, J. H., Chung, T. D. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67–73 (1999).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Kim, S. et al. PubChem substance and compound databases. Nucleic Acids Res. 44, D1202–D1213 (2016).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Degtyarenko, K. et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344–D350 (2008).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672 (2006).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Williams, A. J. et al. The CompTox Chemistry Dashboard: a community data resource for environmental chemistry. J. Cheminform. 9, 61 (2017).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Bergerhoff, G., Hundt, R., Sievers, R. & Brown, I. D. The inorganic crystal structure data base. J. Chem. Inf. Comput. Sci. 23, 66–69 (1983).

    Article 
    CAS 

    Google Scholar 

  • Belsky, A., Hellenbrandt, M., Karen, V. L. & Luksch, P. New developments in the Inorganic Crystal Structure Database (ICSD): accessibility in support of materials research and design. Acta Cryst. 58, 364–369 (2022).

    Article 

    Google Scholar 

  • Kononova, O. et al. Text-mined dataset of inorganic materials synthesis recipes. Sci. Data 6, 203 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Shivanyuk, A., Ryabukhin, S. V., Bogolubsky, A. V. & Tolmachev, A. Enamine REAL database: making chemical diversity real. Chem. Today 25, 58–59 (2007).

    CAS 

    Google Scholar 

  • Coley, C. W., Green, W. H. & Jensen, K. F. Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 51, 1281–1289 (2018).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Fink, T., Bruggesser, H. & Reymond, J.-L. Virtual exploration of the small-molecule chemical universe below 160 Daltons. Angew. Chem. Int. Ed. 44, 1504–1508 (2005).

    Article 
    CAS 

    Google Scholar 

  • Fink, T. & Reymond, J.-L. Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J. Chem. Inf. Model. 47, 342–353 (2007).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Blum, L. C. & Reymond, J.-L. 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 131, 8732–8733 (2009).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Ruddigkeit, L., van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Baell, J. B. & Holloway, G. A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719–2740 (2010).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Brenk, R. et al. Lessons learnt from assembling screening libraries for drug discovery for neglected diseases. ChemMedChem 3, 435–444 (2008).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug. Dis. Rev. 23, 3–25 (1997).

    Article 
    CAS 

    Google Scholar 

  • Wong, F. et al. Supporting code for: discovery of a structural class of antibiotics with explainable deep learning. Zenodo (2023).

  • Samuel, A. L. Some studies in machine learning using the game of checkers. IBM J. 3, 211–229 (1959).

    Article 

    Google Scholar 

  • Rosenblatt, F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958).

    Article 
    CAS 
    PubMed 

    Google Scholar 

  • Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

    Article 

    Google Scholar 

  • Krizhensky, A., Sutskever, I. & Hinton, G. E. In Advances in Neural Information Processing Systems 1106–1114 (2012).

  • Vaswani, A. et al. In Advances in Neural Information Processing Systems (2017).

  • Trinh, T. H., Wu, Y., Le, Q. V., He, H. & Luong, T. Solving olympiad geometry without human demonstrations. Nature 625, 476–482 (2024).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Lundberg, S. M. and Lee, S.-I. In Proc. 31st International Conference on Neural Information Processing Systems 4768–4777 (2017).

  • Ribeiro, M. T., Singh, S. & Guestrin, C. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (2016).

  • Dai, H., Dai, B. & Song, L. In Proc. 33rd International Conference on Machine Learning 2702–2711 (2016).

  • Buterez, D., Janet, J. P., Kiddle, S. J., Oglic, D. & Lió, P. Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting. Nat. Commun. 15, 1517 (2024).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Xie, T., France-Lanord, A., Wang, Y., Shao-Horn, Y. & Grossman, J. Y. Graph dynamical networks for unsupervised learning of atomic scale dynamics in materials. Nat. Commun. 10, 2667 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar 

  • Yun, S., Jeong, M., Kim, R. Kang, J. & Kim, H. J. In 33rd Conference on Neural Information Processing Systems 11983–11993 (2019).

  • link

    Leave a Reply

    Your email address will not be published. Required fields are marked *