Reviewed by Alex Smith, Sep 1 2021
The U.S. Department of Energy’s (DOE) Argonne National Laboratory is working to link artificial intelligence (AI) with sophisticated simulation workflows to gain better insight into biological observations and to speed up drug discovery.
The work is part of the ongoing campaign to reveal the inner workings of the SARS-CoV-2 virus.
Argonne worked with academic and commercial research collaborators to achieve near real-time feedback between simulation and AI methods, with the goal of understanding how two proteins encoded by the SARS-CoV-2 viral genome, nsp10 and nsp16, interact to help the virus replicate and evade the host's immune system.
The researchers reached this milestone by coupling two different hardware platforms: ThetaGPU, an AI- and simulation-enabled extension of the Theta supercomputer, and Cerebras CS-1, a processor-packed silicon wafer deep learning accelerator, housed at the Argonne Leadership Computing Facility, a DOE Office of Science User Facility.
To realize this capability, the researchers developed Stream-AI-MD, a novel application of the AI method known as deep learning to drive adaptive molecular dynamics (MD) simulations in a streaming fashion. Data produced by the simulations are streamed from ThetaGPU to the Cerebras CS-1 platform so that the interaction of the two proteins can be analyzed concurrently.
This needs to be done at a scale that is unprecedented since the data generation and AI components have to run side-by-side. The idea is, if one machine is good at doing MD simulations and another is very good at AI, then why not couple the two to produce a much larger system that offers more throughput with AI.
Arvind Ramanathan, Computational Biologist, Argonne National Laboratory
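The side-by-side arrangement described above, with one component generating simulation data while another analyzes it as it arrives, resembles a producer-consumer pattern. The following is a minimal sketch of that pattern only, not the Stream-AI-MD implementation: the names `simulate`, `analyze` and `run_stream` are hypothetical, the "simulation" emits random numbers, the "AI" step is a running mean, and the feedback from the analysis back to the simulation is omitted.

```python
import queue
import random
import threading

SENTINEL = None  # marks the end of the stream


def simulate(out_q, n_frames, seed=0):
    """Hypothetical stand-in for an MD simulation: emits fake 3-value frames."""
    rng = random.Random(seed)
    for step in range(n_frames):
        frame = [rng.gauss(0.0, 1.0) for _ in range(3)]
        out_q.put((step, frame))  # blocks if the consumer falls behind
    out_q.put(SENTINEL)


def analyze(in_q, results):
    """Hypothetical stand-in for the AI side: updates a running mean per value."""
    count = 0
    mean = [0.0, 0.0, 0.0]
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        _, frame = item
        count += 1
        mean = [m + (x - m) / count for m, x in zip(mean, frame)]
    results["frames_seen"] = count
    results["running_mean"] = mean


def run_stream(n_frames=100):
    """Run producer and consumer concurrently over a bounded queue."""
    q = queue.Queue(maxsize=8)
    results = {}
    producer = threading.Thread(target=simulate, args=(q, n_frames))
    consumer = threading.Thread(target=analyze, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_stream(100)["frames_seen"])
```

The bounded queue is the design choice worth noting: it keeps the data generator from running arbitrarily far ahead of the analysis, which is one simple way to keep two components running side by side.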
One of the AI methods the researchers employed is a variational autoencoder, which learns to capture the most important information in the data produced by the MD simulations. By reducing the size of the simulation data sets, the researchers made it easier for scientists to understand the dynamics of the simulation.
Running the deep learning component on Cerebras CS-1 helped the researchers identify binding pockets, tiny spaces that may form during the development of the two proteins and that can be targeted for small-molecule drug design.
Eventually, such workflows could enable the discovery of drugs to treat both SARS-CoV-2 and other diseases, once the physical processes underlying particular biological functions are characterized, Ramanathan said. And while the study does not currently focus on vaccines, the development of highly complex models could lead to vaccine design.
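For readers unfamiliar with variational autoencoders, the sketch below shows the two pieces that distinguish them from ordinary autoencoders: sampling a compact latent code from a learned Gaussian, and a loss that combines reconstruction error with a Kullback-Leibler penalty. It is a generic, textbook-style illustration with hypothetical function names, assuming a diagonal Gaussian encoder and a unit-variance Gaussian decoder. It is not the model used in the study, and the encoder and decoder networks themselves are left out.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from a standard normal."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def vae_loss(x, x_reconstructed, mu, log_var):
    """Reconstruction error plus KL penalty.

    With a unit-variance Gaussian decoder (an assumption of this sketch),
    this equals the negative evidence lower bound up to an additive constant.
    """
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_reconstructed))
    return recon + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [0.0, -1.0]   # made-up encoder outputs
    z = reparameterize(mu, log_var, rng)     # low-dimensional latent code
    print(len(z), vae_loss([1.0, 2.0, 3.0], [1.1, 1.9, 3.2], mu, log_var))
```

The latent code `z` is much smaller than the input it summarizes, which is the sense in which such a model reduces the size of the data while retaining its most informative features.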
This iterative workflow of supporting streaming AI and MD techniques on emerging hardware platforms will pave the way for advancing our knowledge of how proteins function. In the context of the SARS-CoV-2 virus, a fundamental understanding of molecular processes, such as the nsp16-nsp10 interaction, is important if we want to design drugs that can stop the virus in its path.
Arvind Ramanathan, Computational Biologist, Argonne National Laboratory
The study was a collaboration between Argonne and Cerebras Systems Inc. It was supported financially by the Exascale Computing Project, a collaborative effort of the U.S. DOE Office of Science and the National Nuclear Security Administration, and by the DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on the response to COVID-19, with funding provided by the Coronavirus Aid, Relief and Economic Security (CARES) Act.
ThetaGPU was made possible with assistance from the CARES Act.
Journal Reference:
Brace, A., et al. (2021) Stream-AI-MD: streaming AI-driven adaptive molecular simulations for heterogeneous computing platforms. Proceedings of the Platform for Advanced Scientific Computing Conference. doi.org/10.1145/3468267.3470578.