The products of our company are only used for external research, not for clinical diagnosis
0086-27-65522046

Intelligent Protein Structure Prediction Platform: AlphaFold

Published: 2023-05-18 19:33




Link: https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb

Note: If you have any problems while using AlphaFold, you can contact us.


AlphaFold, a newly launched technology under DeepMind, a subsidiary of Alphabet (Google's parent company), can predict and generate the 3D shape of proteins based solely on their genetic "code".

DeepMind states that AlphaFold is "an important milestone that demonstrates for the first time that artificial intelligence research can drive and accelerate scientific discovery".


Design Objective

The human body can produce tens or even hundreds of thousands of different proteins. Each protein is a chain of amino acids, of which there are 20 different types. The chain can twist and bend at the links between amino acids, so a protein made of hundreds of amino acids can, in principle, take on an astonishing number of shapes (on the order of 10 to the power of 300).
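The scale of that number can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that every amino acid in the chain independently picks one of a fixed number of local shapes; the values of 300 amino acids and 10 shapes per amino acid are hypothetical inputs chosen to reproduce the order of magnitude, not figures published by DeepMind.

```python
def conformation_count(residues: int, states_per_residue: int) -> int:
    """Count chain shapes if every residue independently picks one local state."""
    return states_per_residue ** residues


def order_of_magnitude(n: int) -> int:
    """Return floor(log10(n)) for a positive integer, using its decimal length."""
    return len(str(n)) - 1


# Illustrative assumption: 300 residues, 10 local states each.
total = conformation_count(residues=300, states_per_residue=10)
print(f"about 10^{order_of_magnitude(total)} possible shapes")
```

Because the count grows exponentially with chain length, checking every shape one by one is not feasible, which is why prediction methods are needed.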

The 3D shape of a protein depends on the number, type, and order of the amino acids it contains, and this shape in turn determines its function in the human body. For example, proteins on heart cells are folded so that adrenaline in the bloodstream sticks to them, which speeds up the heart rate. Antibodies in the immune system are folded into specific shapes to lock onto invaders. Almost every function of the body - from contracting muscles and sensing light to converting food into energy - is related to the shape and movement of proteins.

Usually, proteins assume the most energy-efficient shape, but they can become entangled or fold incorrectly, leading to diseases such as diabetes, Parkinson's, and Alzheimer's. If scientists can predict the shape of a protein based on its chemical composition, they can know what it does, how it can go wrong and cause harm, and design new proteins to combat diseases or perform other duties, such as breaking down plastic pollution in the environment.


Integrating AI

Because protein structure is so important, scientists have spent the past fifty years determining protein shapes with experimental techniques such as cryo-electron microscopy and nuclear magnetic resonance. However, each method relies on a great deal of trial and error, and each structure may require years of research and cost tens of thousands of dollars. Therefore, biologists have turned to AI methods as an alternative to this difficult and tedious process.

Fortunately, the rapid decrease in gene sequencing costs has made genomic data abundant. As a result, prediction problems based on genomic data have increasingly relied on deep learning methods in recent years. DeepMind took a close interest in this problem and developed AlphaFold, which it submitted to the Critical Assessment of Structure Prediction (CASP).

DeepMind used AlphaFold to participate in CASP, a biennial protein folding Olympics that attracts research teams from around the world. The goal of the competition is to predict the structure of proteins based on lists of amino acids that are sent to the participating teams every few days over several months. The structures of these proteins have recently been decoded through laborious and expensive traditional methods but have not yet been made public. The team that submits the most accurate prediction will win.

Although this was its first time in the competition, AlphaFold ranked first among 98 competitors, accurately predicting the structure of 25 out of 43 proteins. In contrast, the second-place team accurately predicted only three. It is worth noting that AlphaFold focuses on modeling the target shape from scratch and does not use previously solved proteins as templates. AlphaFold predicts the physical properties of a protein structure with high accuracy and then uses one of two different methods to build the complete protein structure from those predictions.


Model Building

The models built by AlphaFold rely on deep neural networks that predict properties of a protein from its genetic sequence. According to DeepMind researchers, the properties predicted by the neural networks are primarily (a) the distances between pairs of amino acids and (b) the angles between the chemical bonds that connect those amino acids. The first of these is an advance on commonly used techniques that only estimate whether pairs of amino acids are close to each other.
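The difference between property (a) and the older "close or not" estimate can be illustrated with a small sketch. It assumes each amino acid is reduced to a single 3D point and uses an 8 angstrom cutoff for closeness; the coordinates, the cutoff, and the function names are illustrative assumptions and are not taken from AlphaFold's code.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue positions (angstroms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(distances, threshold=8.0):
    """Reduce each distance to a yes/no answer: is the pair closer than threshold?"""
    return [[d < threshold for d in row] for row in distances]


# Hypothetical positions of four residues along a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

distances = distance_matrix(coords)
contacts = contact_map(distances)

print(distances[0][2], contacts[0][2])  # a pair inside the cutoff
print(distances[0][3], contacts[0][3])  # a pair outside the cutoff
```

The distance matrix keeps how far apart each pair is, while the contact map keeps only one bit per pair, so a distance prediction carries more information about the structure.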

To build AlphaFold, DeepMind trained a neural network on thousands of known proteins until it could predict the 3D structure of a protein from its amino acid sequence alone. Given a new protein, AlphaFold uses the neural network to predict the distances between pairs of amino acids and the angles between the chemical bonds connecting them. AlphaFold then adjusts a draft structure to find the most energy-efficient arrangement. The first protein structure took two weeks to predict, but the same task can now be done within a few hours.

For the first of these properties, DeepMind trained a neural network to predict a separate probability distribution over the distance between every pair of residues in a protein. These probabilities are then combined into a score that estimates how accurate a proposed protein structure is. In addition, DeepMind trained a separate neural network that uses all the distances in aggregate to estimate how close a proposed structure is to the actual structure.
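One simple way to combine per-pair distance probabilities into a single score is to add up the log-probability that each predicted distribution assigns to the distance actually found in a candidate structure. The sketch below takes that approach as an assumption; the distance bins, the probabilities, and the function names are hypothetical, and DeepMind's actual scoring function may differ.

```python
import math


def bin_index(distance, bin_edges):
    """Index of the bin containing `distance`; the last bin is open-ended."""
    for i, edge in enumerate(bin_edges):
        if distance < edge:
            return i
    return len(bin_edges)


def structure_score(coords, predicted, bin_edges):
    """Sum of log-probabilities of the observed pair distances (higher is better)."""
    score = 0.0
    for (i, j), probs in predicted.items():
        d = math.dist(coords[i], coords[j])
        score += math.log(probs[bin_index(d, bin_edges)])
    return score


# Hypothetical bins in angstroms: <4, 4 to 8, 8 to 12, 12 and above.
bin_edges = [4.0, 8.0, 12.0]

# Hypothetical network output: one distribution over the bins per residue pair.
predicted = {
    (0, 1): [0.7, 0.2, 0.05, 0.05],
    (0, 2): [0.1, 0.7, 0.1, 0.1],
    (1, 2): [0.7, 0.2, 0.05, 0.05],
}

compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
stretched = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (18.0, 0.0, 0.0)]

print(structure_score(compact, predicted, bin_edges))
print(structure_score(stretched, predicted, bin_edges))
```

The compact candidate places every pair in the bin the hypothetical network considers most likely, so it scores higher than the stretched one.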

These scoring functions can be used to search the space of possible protein structures for ones that match the predictions. DeepMind's first method builds on a technique commonly used in structural biology: repeatedly replacing a part of the overall protein structure with a new protein fragment. DeepMind trained a generative neural network to create new fragments, which were used to continuously improve the score of the proposed protein structure.
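The fragment-replacement idea can be sketched as a simple hill climb: swap a short fragment into a random position and keep the change only if the score improves. In the sketch below the structure is reduced to a list of backbone angles, the fragments come from a small fixed list instead of a generative network, and the score is a toy stand-in for the learned one; all of these are simplifying assumptions.

```python
import random


def fragment_search(structure, fragments, score, steps=2000, seed=0):
    """Hill-climb by swapping in fragments and keeping swaps that raise the score."""
    rng = random.Random(seed)
    best = list(structure)
    best_score = score(best)
    for _ in range(steps):
        fragment = rng.choice(fragments)
        start = rng.randrange(len(best) - len(fragment) + 1)
        candidate = best[:start] + list(fragment) + best[start + len(fragment):]
        candidate_score = score(candidate)
        if candidate_score > best_score:  # accept only strict improvements
            best, best_score = candidate, candidate_score
    return best, best_score


# Hypothetical target angles (degrees) that define the toy score.
target = [-60.0, -60.0, -60.0, -120.0, -120.0, -120.0]


def toy_score(angles):
    """Stand-in for the learned score: negative squared error to the target."""
    return -sum((a - t) ** 2 for a, t in zip(angles, target))


# Hypothetical fragment library: short runs of backbone angles.
fragments = [(-60.0, -60.0, -60.0), (-120.0, -120.0, -120.0), (60.0, 60.0, 60.0)]

start = [0.0] * len(target)
final, final_score = fragment_search(start, fragments, toy_score)
print(toy_score(start), final_score)
```

Accepting only improvements keeps the sketch short; a practical search would also need a way to escape local optima.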

The second method optimizes the score through gradient descent, resulting in highly accurate structures. The gradient-based optimization is applied to the entire protein chain at once rather than to segments that are folded individually and then assembled, which reduces the complexity of the prediction process.
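A toy version of this whole-chain optimization is sketched below. It assumes 2D coordinates, a squared-error loss against hypothetical target distances, and a fixed learning rate; none of these choices comes from DeepMind's system, which optimizes its own learned score.

```python
import math


def loss(coords, targets):
    """Sum of squared differences between actual and target pair distances."""
    return sum((math.dist(coords[i], coords[j]) - t) ** 2
               for (i, j), t in targets.items())


def gradient_step(coords, targets, lr=0.05):
    """One gradient-descent step that updates every coordinate at once."""
    grads = [[0.0] * len(p) for p in coords]
    for (i, j), t in targets.items():
        d = math.dist(coords[i], coords[j])
        if d == 0.0:
            continue  # gradient is undefined for coincident points; skip the pair
        factor = 2.0 * (d - t) / d
        for k in range(len(coords[i])):
            diff = coords[i][k] - coords[j][k]
            grads[i][k] += factor * diff
            grads[j][k] -= factor * diff
    return [[x - lr * g for x, g in zip(p, gp)] for p, gp in zip(coords, grads)]


# Hypothetical target distances (angstroms) standing in for network predictions.
targets = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 5.4}
coords = [[0.0, 0.0], [1.0, 0.5], [2.0, -0.5]]  # arbitrary starting guess

initial = loss(coords, targets)
for _ in range(500):
    coords = gradient_step(coords, targets)
print(initial, loss(coords, targets))
```

Every step moves all points together, which mirrors the idea of optimizing the full chain instead of folding pieces separately.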


A Promising Future

The success of the first foray into protein folding indicates that machine learning systems can integrate various sources of information to help scientists find creative solutions to complex problems quickly. Artificial intelligence has already mastered complex games through systems like AlphaGo and AlphaZero, and similarly, the future of using AI to tackle fundamental scientific problems is also promising.

Liam McGuffin, a researcher at the University of Reading, led the highest-scoring academic team in the competition. "DeepMind appears to have made greater progress, and I want to learn more about their methods. Our resources are limited, but we are still very competitive," he said.

"Predicting the shape of protein folding is critical, and it has a significant impact on solving many centuries-old problems. This ability can affect health, ecology, the environment, and basically solve any problem related to life systems."

"Many teams, including us, have been using machine learning-based methods for years, and advances in deep learning and AI seem to be increasingly important. I am optimistic about this field, and I think we will truly solve this problem in the 2020s," McGuffin said.

Demis Hassabis, co-founder and CEO of DeepMind, also pointed out that there is still much work to be done. "We have not yet solved the problem of protein folding, we have only taken the first step. This is an extremely challenging problem, but we have a good system, and there are many ideas that have yet to be put into practice."

The early progress in protein folding is exciting, and it demonstrates the usefulness of artificial intelligence for scientific discoveries. Although there is still much work to be done before we can achieve quantifiable impacts in disease treatment, environmental management, and other areas, we know that the potential of artificial intelligence is enormous. With the efforts of a professional team dedicated to studying how machine learning can advance scientific development, we look forward to seeing technology make a difference.


About us
Sales:Sales@beyoscience.com
Service:Service@beyoscience.com
Tel:0086-27-65522046
Floor 5, Building B1, Optics Valley Biological City, No. 666 Gaoxin Avenue, East Lake New Technology Development Zone, Wuhan