|
|
|
|
|
by mxwsn
784 days ago
|
|
It's not an English LLM, but a "protein" language model, where tokens represent amino acids or nucleotides. Learning a transformer language model on such data simply learns a distribution over sequences of tokens. It's a fine approach conceptually that in many ways is the "right" way or most elegant method, and not a stretch at all. |
|