| HN Mirror



	by kmacdough 455 days ago
	Keep in mind this release was never intended to prove superiority. Rather, it shows an alternative structure with some promising performance characteristics. More work needs to be done to show real application, but this is very valuable learning. That's part of the reason to compare against older, smaller models, since they're at a more comparable stage of development.

1 comment

vlovich123 455 days ago

I agree. As I was trying to imply, I think if you integrated this structure into OpenAI’s or Claude’s stack, you’d get a vastly cheaper model that’s significantly faster with similar task performance (modulo the structural task performance parts that are hard to port to this new architecture). The point about quality was also intended to temper some of the excitement about the scores published on the page.
