by kir-gadjello
1194 days ago
As the discussion of GPT-4 heats up, the absence of details on its technical implementation becomes only more glaring. As an engineer, I learned nothing applicable from the newest OpenAI publication that I did not already know yesterday! I have been investigating LLM training and inference for quite some time and have developed a number of hypotheses about future SoTA models, which I believe very likely apply to GPT-4.

For example, I think Google's paper "Sparse is Enough in Scaling Transformers" was very underrated: it provided more than an order of magnitude improvement in inference economy, and its authors included one OpenAI researcher.