AI solves 50-year-old science problem in ‘stunning advance’ that could dramatically change how we fight diseases, researchers say

A 50-year-old science problem has been solved and could allow for dramatic changes in the fight against diseases, researchers say.

For years, scientists have been struggling with the problem of “protein folding” – mapping the three-dimensional shapes of the proteins that are responsible for diseases from cancer to Covid-19.

Google’s DeepMind claims to have created an artificially intelligent program called “AlphaFold” that is able to solve those problems in a matter of days.

If it works, the solution has come “decades” before it was expected, according to experts, and could have transformative effects in the way diseases are treated.

There are 200 million known proteins at present, but only a fraction have had their structures determined well enough to fully understand what they do and how they work. Even those structures that have been solved often required expensive and time-intensive techniques, with scientists spending years working out each one and relying on equipment that can cost many millions of dollars.

DeepMind entered AlphaFold in the 14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP14), the latest round of a biennial blind assessment that scientists have been running since 1994.

“Proteins are extremely complicated molecules, and their precise three-dimensional structure is key to the many roles they perform, for example the insulin that regulates sugar levels in our blood and the antibodies that help us fight infections,” Dr John Moult, chair of CASP14, said.

“Even tiny rearrangements of these vital molecules can have catastrophic effects on our health, so one of the most efficient ways to understand disease and find new treatments is to study the proteins involved.

“There are tens of thousands of human proteins and many billions in other species, including bacteria and viruses, but working out the shape of just one requires expensive equipment and can take years.”

During the latest test, DeepMind said AlphaFold determined the shape of around two-thirds of the proteins with accuracy comparable to laboratory experiments. The results of those tests have been published online, so that they can be scrutinised by external scientists.

Now researchers behind the project say there is still more work to be done, including figuring out how multiple proteins form complexes and how they interact with DNA.

DeepMind is planning to submit a paper detailing its system to a peer-reviewed journal to be scrutinised by the wider scientific community.

Professor Venki Ramakrishnan, Nobel Laureate and president of the Royal Society, said: “This computational work represents a stunning advance on the protein-folding problem, a 50-year-old grand challenge in biology.

“It has occurred decades before many people in the field would have predicted.

“It will be exciting to see the many ways in which it will fundamentally change biological research.”

DeepMind noted that among other things, the prediction of protein structures could be an important part of responses to future pandemics, and that it had already used its machine learning technology on the protein structures of the SARS-CoV-2 virus, which causes Covid-19.

somethingstrang on November 30th, 2020 at 19:15 UTC »

This is from DeepMind, which is the same team that made AlphaGo. Nature has already commented on it. It’ll likely get through peer review successfully.

Oztotl on November 30th, 2020 at 18:57 UTC »

I remember when my roommate bought a ps3 like 16 years ago. We installed a protein folding app that was supposed to use the ps3 as a node for computing. We used to leave it on for days at a time. Wonder if we helped at all lol.

aqlu on November 30th, 2020 at 17:15 UTC »

E: For those interested, /u/mehblah666 wrote a lengthy response to the article.

All right here I am. I recently got my PhD in protein structural biology, so I hope I can provide a little insight here.

The thing is, what AlphaFold does at its core is more or less what several computational structure prediction models have already done. That is to say, it essentially shakes up a protein sequence and helps fit it using input from evolutionarily related sequences (this can be calculated mathematically, and the basic underlying assumption is that related sequences have similar structures). The accuracy of AlphaFold in their blinded studies is very, very impressive, but it does suggest that the algorithm is somewhat limited in that you need a fairly significant knowledge base to get an accurate fold, which itself (like any structural model, whether computationally determined or determined using an experimental method such as X-ray crystallography or cryo-EM) needs to be validated biochemically.

Where I am very skeptical is whether this can be used to give an accurate fold of a completely novel sequence, one that is unrelated to other known or structurally characterized proteins. There are many, many such sequences and they have long been targets of study for biologists. If AlphaFold can do that, I’d argue it would be more of the breakthrough that Google advertises it as. This problem has been the real goal of these protein folding programs, or to put it more concisely: can we predict the 3D fold of any given amino acid sequence, without prior knowledge?

As it stands now, it’s been shown primarily as a way to give insight into the possible structures of specific versions of different proteins (which again seems to be very accurate), and this has tremendous value across biology, but Google is trying to sell here, and it’s not uncommon for that to lead to a bit of exaggeration.

I hope this helped. I’m happy to clarify any points here! I admittedly wrote this a bit off the cuff.

E#2: Additional reading, courtesy /u/Lord_Nivloc