


	by kir-gadjello 1194 days ago
	As the discussion of GPT-4 heats up, the absence of details on its technical implementation only becomes more glaring. As an engineer, I learned nothing applicable from the newest OpenAI publication that I didn't already know yesterday! I have been investigating LLM training and inference issues for quite some time and have developed a number of hypotheses about future SoTA models, which I believe very likely apply to GPT-4.

1 comment

kir-gadjello 1194 days ago

If you have questions about my rationale for any technique included in the list, please ask!

For example, I think Google's paper "Sparse is Enough in Scaling Transformers" was very underrated: it provided more than an order of magnitude improvement in inference economy, and it included one OpenAI researcher among its authors.
