by jonmc12 1117 days ago
Some of the lower-hanging fruit in chemistry data was addressed by earlier versions of deep learning, like AlphaFold. It seems the nature of the domain is such that the language is less ambiguous than most natural language. Does anyone have a perspective on the apparent advantages of mapping chemistry interactions to latent space models for LLM training?
1 comment

Any arbitrary but meaningful seq2seq data can, in principle, be modelled by a transformer.

Language models can be trained on protein sequences paired with their functions to generate novel, functional protein sequences, bypassing any explicit folding step entirely.

https://www.nature.com/articles/s41587-022-01618-2
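To make the "it's just sequence data" point concrete, here is a minimal sketch of how a function-tagged protein sequence could be turned into next-token training examples. It is an illustration only, not the linked paper's pipeline: the tag names, special tokens, and helper functions are all hypothetical, and a real setup would feed these pairs to an actual transformer.

```python
# Hypothetical sketch: tags, special tokens and helper names are assumptions
# made for illustration, not taken from the linked paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def build_vocab(tags):
    """Map special tokens, function tags and residues to integer ids."""
    tokens = ["<bos>", "<eos>"] + [f"<{t}>" for t in tags] + list(AMINO_ACIDS)
    return {tok: i for i, tok in enumerate(tokens)}


def encode(tag, sequence, vocab):
    """Prefix the sequence with its function tag so generation can be conditioned on it."""
    tokens = [f"<{tag}>", "<bos>"] + list(sequence) + ["<eos>"]
    return [vocab[t] for t in tokens]


def next_token_pairs(ids):
    """Yield (context, target) pairs: the standard language-modelling objective."""
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


vocab = build_vocab(["lysozyme", "kinase"])  # made-up tag set
ids = encode("lysozyme", "MKAL", vocab)      # toy 4-residue sequence
pairs = next_token_pairs(ids)
```

At generation time you would start from the tag and `<bos>` and sample residues until `<eos>`; nothing in the setup refers to 3D structure.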

May as well try.