June 27th, 2024

AI Revolutionized Protein Science, but Didn't End It

Artificial intelligence, exemplified by AlphaFold2 and its successor AlphaFold3, revolutionized protein science by predicting structures accurately. AI complements but doesn't replace traditional methods, emphasizing collaboration for deeper insights.

Artificial intelligence (AI) made a significant impact on protein science when Google's AlphaFold2 presented a breakthrough in predicting protein structures with over 90% accuracy, revolutionizing the field. This success shifted the way biologists study proteins, emphasizing the power of AI in biology. While AlphaFold2 has inspired new algorithms and biotech companies, it has not replaced biological experiments but highlighted their importance. The successor, AlphaFold3, introduced in May 2024, expanded predictions to include protein structures in combination with other molecules like DNA or RNA. Despite AI's advancements, significant gaps remain, such as simulating protein dynamics over time or within cellular contexts. The protein folding problem, a fundamental question in biology, has intrigued scientists for decades, with AI offering a new perspective but not a complete solution. The story of AlphaFold's impact on protein science underscores the ongoing collaboration between AI and traditional biological research methods to advance understanding in this critical field.
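For readers who want to look at these predictions directly, the AlphaFold Protein Structure Database exposes the models programmatically. Below is a minimal sketch of fetching one predicted structure; the endpoint and JSON field names (entryId, pdbUrl, latestVersion) are assumptions based on the database's public API documentation and may change, and the UniProt accession is only an illustrative choice.

```python
import json
import urllib.request

# Illustrative accession: P69905 (human hemoglobin subunit alpha).
ACCESSION = "P69905"
# Assumed public AlphaFold DB endpoint; see https://alphafold.ebi.ac.uk/api-docs
API_URL = f"https://alphafold.ebi.ac.uk/api/prediction/{ACCESSION}"

with urllib.request.urlopen(API_URL) as resp:
    entries = json.load(resp)  # the API returns a list of model entries

entry = entries[0]
print(entry.get("entryId"), entry.get("latestVersion"))

# Download the predicted coordinates (PDB format) for local inspection.
urllib.request.urlretrieve(entry["pdbUrl"], f"{ACCESSION}_alphafold.pdb")
```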

Related

Six things to keep in mind while reading biology ML papers

The article outlines considerations for reading biology machine learning papers, cautioning against blindly accepting results, emphasizing critical evaluation, understanding limitations, and recognizing biases. It promotes a nuanced and informed reading approach.

Are AlphaFold's new results a miracle?

AlphaFold 3 by DeepMind excels at predicting molecule-protein binding, surpassing AutoDock Vina. Concerns about data redundancy, generalization, and whether the model truly captures molecular interactions prompt scrutiny of its reliability for drug discovery.

Official PyTorch Documentary: Powering the AI Revolution [video]

The YouTube video discusses AI technology advancements, mentioning Torch, Theano, Caffe, and the transition from Facebook AI Research to Meta AI Research. It covers Caffe2 for mobile apps, TensorFlow's 2015 debut, and PyTorch's launch as a Python machine learning library in January 2017.

How AI Revolutionized Protein Science, but Didn't End It

Artificial intelligence, exemplified by AlphaFold2 and AlphaFold3, revolutionized protein science by accurately predicting protein structures. Despite advancements, AI complements rather than replaces biological experiments, highlighting the complexity of simulating protein dynamics.

AI can beat real university students in exams, study suggests

A study from the University of Reading reveals AI outperforms real students in exams. AI-generated answers scored higher, raising concerns about cheating. Researchers urge educators to address AI's impact on assessments.

5 comments
By @trivexwe - 5 months
Weird article.

It mentions multiple times that ~"the protein folding problem is solved", as well as multiple instances of ~"but there are limitations to this technique and it is often missing crucial details".

It really is difficult to conceptualize these highly nonlinear problem spaces, like protein folding, until you attempt to work with them.

Many in software development have an intuitive understanding of this difficulty, evidenced by the community's ~"the last 10% took 100% of the time" meme.

Even in nonlinear problem spaces you have "trivial" solutions.

Terry Tao famously coauthored a paper on arithmetic progressions of primes.[1] The sequences found are "trivial" in terms of "solving the prime sequence problem" in that they are sparse, the sequences are finite, and the progressions lack a method for finding more.

These machine learning tools are by design approximation engines. I'm unsure of any results that prove, one way or the other, whether it is possible to pass a bound of approximation that provides exact solutions. (Think: an approximate solution that only fails to be exact for solutions that are trivial by some other method; I think a lot of work in p-adics is motivated similarly.)

I feel these machine learning techniques are expanding the definition of "trivial solutions" to include those capable of being solved by their convoluted methods (backprop, etc.). Since this new subset of the space that can be labeled "solved" appears more complex than known trivial solutions, people assume the whole space must be known, and this is where the difficulty of conceptualization rears its influence.

Protein folding is still an unsolved problem, and I’m dubious of the notion machine learning will ever solve it, but hopefully we get some helpful science out of it.

[1] https://en.m.wikipedia.org/w/index.php?title=Green%E2%80%93T...

By @ak_111 - 5 months
"...Some cell biologists and biochemists who used to work with structural biologists have replaced them with AlphaFold2 — and take its predictions as truth. Sometimes scientists publish papers featuring protein structures that, to any structural biologist, are obviously incorrect, Perrakis said. “And they say: ‘Well, that’s the AlphaFold structure.’”"

It is amazing that this happens. I am not naive about academic standards, but if something is clearly wrong and used in a paper (especially one with consequences for human health) then it should be quite easy to name-and-shame until the editors of the journal force the authors to issue a retraction or correction if the authors don't do it themselves. Otherwise people should start name-and-shaming the journal, and its reputation should sink.

Also, I am curious whether there are already lists of known incorrect predictions by AlphaFold. Shouldn't these be published, and shouldn't AlphaFold's database tag such predictions accordingly, to notify users that those particular predictions have been proven wrong?

By @DrScientist - 5 months
It is/was a brilliant piece of work (Nobel prize level). However, I think the impact is over-hyped. As somebody who has experimentally solved a protein structure, I can tell you that knowing the structure doesn't necessarily help you understand the biology; not every structure is as functionally obvious as the structure of DNA, for example.

In terms of drug discovery, even assuming the models are as good as experimental structures, you only get the same benefit as experimental structures, which have helped small-molecule drug discovery but, I would argue, not transformed it. All the existing challenges with structure-based drug design remain.

BTW, while AlphaFold 2 was a big step forward from AlphaFold 1, it wasn't a complete shock, as AlphaFold 1 had already topped the charts in a previous competition a couple of years earlier.

By @nybsjytm - 5 months
Great article, covers well both the achievements and the shortcomings. It's crazy how many people write about these kinds of AI developments while completely skipping over anything like the following:

> The “good news is that when AlphaFold thinks that it’s right, it often is very right,” Adams said. “When it thinks it’s not right, it generally isn’t.” However, in about 10% of the instances in which AlphaFold2 was “very confident” about its prediction (a score of at least 90 out of 100 on the confidence scale), it shouldn’t have been, he reported: The predictions didn’t match what was seen experimentally.

> That the AI system seems to have some self-skepticism may inspire an overreliance on its conclusions. Most biologists see AlphaFold2 for what it is: a prediction tool. But others are taking it too far. Some cell biologists and biochemists who used to work with structural biologists have replaced them with AlphaFold2 — and take its predictions as truth. Sometimes scientists publish papers featuring protein structures that, to any structural biologist, are obviously incorrect, Perrakis said. “And they say: ‘Well, that’s the AlphaFold structure.’” ...

> Jones has heard of scientists struggling to get funding to determine structures computationally. “The general perception is that DeepMind did it, you know, and why are you still doing it?” Jones said. But that work is still necessary, he argues, because AlphaFold2 is fallible.

> “There are very large gaps,” Jones said. “There are things that it can’t do quite clearly.”
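The confidence score Adams refers to is AlphaFold's per-residue pLDDT, which the AlphaFold database writes into the B-factor column of its PDB files. Here is a minimal sketch of checking it, assuming a model file downloaded as in the earlier snippet; the file name and the cutoff of 70 are illustrative choices, not anything taken from the article.

```python
# Report per-residue pLDDT from an AlphaFold model in PDB format.
# AlphaFold DB models store pLDDT in the B-factor field of each ATOM record.

def residue_plddt(pdb_path):
    """Map (chain, residue number) -> pLDDT, using CA atoms as representatives."""
    scores = {}
    with open(pdb_path) as fh:
        for line in fh:
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                chain = line[21]
                resnum = int(line[22:26])
                plddt = float(line[60:66])  # B-factor column carries pLDDT here
                scores[(chain, resnum)] = plddt
    return scores

scores = residue_plddt("P69905_alphafold.pdb")  # file from the earlier sketch
low = {k: v for k, v in scores.items() if v < 70.0}  # illustrative cutoff
print(f"{len(low)} of {len(scores)} residues fall below pLDDT 70")
```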