

by Version467 1180 days ago

The recent post from Stephen Wolfram[1] is pretty good as an introduction, but I haven't seen any super comprehensive material that tries to dissect all the interesting behaviour we see in the really big LLMs. For that, just reading the relevant papers themselves has been pretty fruitful for me. Some of them are actually very well written, even if you aren't used to reading scientific papers. I can recommend the Sparks of AGI paper[2] and the Toolformer paper[3].

Obviously there's much more out there, but those three are a pretty good read.

[1]: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

[2]: https://arxiv.org/abs/2303.12712

[3]: https://arxiv.org/abs/2302.04761