Hacker News
by throwaway0x7E6 1252 days ago
we have no way to evaluate that. OpenAI products are severely lobotomized. Microsoft fears another Tay.

The article is quite literally a[1] review of exactly how we might evaluate that, with evidence of people who got results.

[1] To be fair, it's a way too wordy and blowhardated version. Alexander seems to be getting worse, not better. The core ideas here could be presented in about a third of the space.

Like ChatGPT, and like 3-hour podcasts, he gets longer over time because he's trained on RLHF from his readers.
Ah, infotainment. Consumers love it, but the same is true of sugar and heroin. I write and help produce a podcast and we are constantly unhappy about the difference between what we think is important vs what people want to hear.
> The article is quite literally a[1] review of exactly how we might evaluate that, with evidence of people who got results.

the procedure seems more like a way to evaluate Anthropic-based AIs with different numbers of parameters than an across-the-board evaluation of fine-tuned chat AIs, and those results are then extrapolated to somehow say something about all AIs that are built similarly.

unless i'm missing something key here, it feels like a rather loose way to derive experimental data from the landscape.

Makes me want to feed the article into some sort of program that could rewrite it to be more succinct.