Reviewed by Lexie Corner, Nov 5 2024
Researchers from Linköping University enhanced the AI tool AlphaFold to predict the shape of extremely large and complex protein structures, according to a study published in Nature Communications. Additionally, the researchers have successfully integrated experimental data into the tool, marking a significant step toward more effective protein development for medical drugs and other applications.
Proteins play a crucial role in regulating cell functions across all living organisms. They are involved in a wide range of bodily processes, from muscle regulation and hair formation to oxygen transport in the blood and the breakdown of food. Proteins are also present outside the body in products such as drugs and detergents.
Proteins are large molecules composed of 20 different amino acids that bond together in long chains, much like beads on a necklace. These chains can vary in length from 50 to several thousand amino acids, resulting in billions of possible combinations. The particular sequence determines how the chain folds into a three-dimensional shape, and that shape in turn dictates the protein's specific function, so differently folded proteins perform entirely different tasks.
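As a rough illustration of how quickly the number of possible chains grows with length, the sketch below counts sequences under the simplifying assumption that every position in the chain can independently hold any of the 20 amino acids. The function name `sequence_count` is illustrative and does not come from the study.

```python
# Count distinct amino acid chains of a given length, assuming each
# position can independently be any of the 20 standard amino acids.
AMINO_ACIDS = 20


def sequence_count(length: int) -> int:
    """Return the number of distinct chains with `length` residues."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return AMINO_ACIDS ** length


if __name__ == "__main__":
    # 7 residues is a short peptide; 50 is the lower bound quoted in the text.
    for length in (7, 50):
        print(f"{length:>3} residues: {sequence_count(length):.3e} possible sequences")
```

Under this assumption, a chain of only 7 residues already allows about 1.28 billion sequences, and a 50-residue chain allows roughly 1.1 × 10^65, which is why exhaustively testing sequences in the laboratory is not feasible.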
For more than 50 years, researchers have attempted to predict and design various protein structures in an effort to better understand bodily functions, investigate different diseases, and develop new drugs. This process has been labor-intensive, costly, and time-consuming, often requiring significant manual effort.
Breakthrough with AI
However, in 2020, the company DeepMind released AlphaFold, an open-source program that uses artificial intelligence based on neural networks to accurately predict a protein's three-dimensional structure, that is, how its chain will fold. This groundbreaking innovation earned its developers a share of the 2024 Nobel Prize in Chemistry.
Despite its advancements, the program has limitations. For instance, it struggles to make inferences from experimental or incomplete data and has difficulty predicting very large protein complexes.
To address these shortcomings, researchers at Linköping University have since enhanced AlphaFold. The modified program, known as AF_unmasked, can now predict extremely large and complex protein structures while also incorporating information from experimental data and handling incomplete datasets.
“We are giving a new type of input to AlphaFold. The idea is to get the whole picture, both from experiments and neural networks, making it possible to build larger structures. But you can also have a draft of a structure that you feed into AlphaFold and get a relatively accurate result.”
Claudio Mirabello, Principal Research Engineer, Docent, Linköping University
Refine Experiments
By providing recommendations on protein design, AF_unmasked aims to assist researchers in enhancing their experimental approaches. This represents a significant step toward developing new types of protein drugs and gaining deeper insights into protein functionality.
Since the 1970s, researchers worldwide have been compiling the structures of approximately 200,000 distinct proteins in a shared database, and AlphaFold was trained using data from this extensive collection, which made the breakthrough possible. The advancement of supercomputer technology, particularly the use of GPUs for complex computations, ultimately enabled the large-scale implementation of this technology.
Linköping University bioinformatics professor Björn Wallner has collaborated with one of the three laureates who shared the 2024 Nobel Prize in Chemistry.
“The possibilities for protein design are endless; only the imagination sets limits. It’s possible to develop proteins for use both inside and outside the body. You always have to find new, more difficult problems when you have solved the old ones. And within our field, finding problems is no problem.”
Björn Wallner, Professor, Linköping University
An Idea from LiU
Wallner created a precursor to AlphaFold with Claudio Mirabello, which inspired DeepMind’s development of the tool. Thanks to the resources of the Google-owned company, DeepMind was able to create what is now an essential tool for protein scientists around the world.
Mirabello added, “AlphaFold wasn’t the first tool to use deep neural networks to solve the problem. In fact, one of the most important characteristics of AlphaFold is that it encodes the evolutionary history of a protein inside the neural network, an idea that actually originated here at LiU and was published by Björn and me in 2019. So, you could say that AlphaFold was based on our idea, and now we are building on AlphaFold.”
The study was primarily funded by SciLifeLab, the Knut and Alice Wallenberg Foundation, and the Swedish Foundation for Strategic Research. The calculations were carried out on the Tetralith and Berzelius supercomputers at Linköping University’s National Supercomputer Centre.
Journal Reference:
Mirabello, C. et al. (2024) Unmasking AlphaFold to integrate experiments and predictions in multimeric complexes. Nature Communications. doi.org/10.1038/s41467-024-52951-w