July 13th, 2024

The Illustrated AlphaFold

The article discusses AlphaFold3's architecture for predicting protein structures, covering Input Preparation, Representation Learning, and Structure Prediction. It highlights improvements such as predicting proteins in complex with other molecules and enriching representations with MSAs and templates.

Read original article

The article provides a detailed visual walkthrough of the AlphaFold3 architecture, focusing on how the model predicts protein structures from sequences alone. Unlike previous versions, AlphaFold3 can also predict the structure of proteins complexed with other molecules. The model consists of three main sections: Input Preparation, Representation Learning, and Structure Prediction. Input Preparation converts user-provided sequences into numerical tensors and retrieves similar molecules for embedding. Representation Learning updates these representations using various forms of attention. Structure Prediction uses the refined representations to predict the structure via conditional diffusion.

The article also explains tokenization, the retrieval of Multiple Sequence Alignments (MSAs) and templates, and the representation of templates as distograms. Including MSAs and templates enriches the protein representations and informs the structure predictions. Retrieval happens at inference time, searching databases for similar sequences and structures without any additional training, which improves the model's accuracy. Overall, the article offers a comprehensive guide to AlphaFold3's complex architecture for protein structure prediction.
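To make the template representation concrete, here is a minimal sketch of how a structure can be encoded as a distogram: pairwise distances between representative atoms, discretized into one-hot bins. The choice of atom and the bin edges below are illustrative assumptions, not AlphaFold3's exact values.

```python
import numpy as np

def distogram(coords: np.ndarray, n_bins: int = 38,
              min_dist: float = 3.25, max_dist: float = 50.75) -> np.ndarray:
    """coords: (n_residues, 3) positions of one representative atom per residue.
    Returns a (n_residues, n_residues, n_bins + 1) one-hot distogram,
    where the last bin catches distances beyond max_dist."""
    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    # Discretize each distance into equal-width bins.
    edges = np.linspace(min_dist, max_dist, n_bins)
    bin_idx = np.searchsorted(edges, dist)   # values in [0, n_bins]
    # One-hot encode the bin index of each residue pair.
    return np.eye(n_bins + 1)[bin_idx]
```

Binning rather than using raw distances keeps the template signal coarse, which is one reason templates can be used as soft hints rather than rigid constraints.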

5 comments
By @tomohelix - 9 months
I consider this a glimpse into how neural networks and "AI"-like techs will be implemented in the future: lots of engineering, lots of clever manipulations of known techniques, woven together with a powerful, well-trained model at the center.

Right now I think stuff like ChatGPT is only at the first step of making that foundational model that can generalize and process data. There isn't a lot of work going into processing the inputs into something the model can best understand (not at the tokenizer level, but even before that). We have a nascent field for this, i.e. prompt engineering, but nothing as sophisticated as AlphaFold exists for natural language or images yet.

People are stacking LLMs together and putting in system prompts to assist this input processing. Maybe when we have some more complex systems in place, we will see something resembling a real AGI.

By @great_tankard - 9 months
This is an awesome writeup that really helped me understand what's going on under the hood. I didn't know, for example, that for the limited number of PTMs AF3 can handle, it has to treat every single atom, including those of the main and side chains, as an individual token (presumably because PTMs are very underrepresented in the PDB?).

Thank you for translating the paper into something this structural biologist can grasp.
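A toy illustration of the tokenization scheme this comment describes (not AF3's actual code): a standard amino acid becomes a single token, while a modified residue is expanded so that each of its atoms gets its own token. The "X" code for the modified residue is a hypothetical placeholder.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

def tokenize(residues):
    """residues: list of (one_letter_code, atom_names) pairs.
    Returns a flat token list: one token per standard residue,
    one token per atom for anything non-standard (e.g. a PTM)."""
    tokens = []
    for code, atoms in residues:
        if code in STANDARD_AA:
            tokens.append(code)    # whole residue = one token
        else:
            tokens.extend(atoms)   # modified residue: every atom is a token
    return tokens

# A hypothetical phosphoserine ("X") expands into its backbone, side-chain,
# and phosphate atoms, unlike the ordinary residues around it.
print(tokenize([("A", ["N", "CA", "C", "O", "CB"]),
                ("X", ["N", "CA", "C", "O", "CB", "OG", "P", "O1P", "O2P", "O3P"]),
                ("G", ["N", "CA", "C", "O"])]))
```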

By @inciampati - 9 months
It's so, so complex! I confess I had a sense of this but had no idea. We don't even hear which MSA algorithm is used to align the protein sequences.

By @mk_stjames - 9 months
I have no prior knowledge of protein folding, but I nevertheless enjoyed (attempting) to read through this. It's interesting to see the complexity of the techniques used compared to a lot of other ML projects today.

By @joelS - 9 months
This is an amazing writeup, thank you. Looking forward to going through it in more detail.