by boredtofears 164 days ago
This article makes a lot of definitive claims about the capabilities of different models that don't align with my experience with them. It's hard to take any claim seriously without completely understanding the state of the context when the behavior was observed. I don't think it's useful to extrapolate a single observation into generalized knowledge about a particular model.

Can't wait until we have useful heuristics for comparing LLMs. This is a problem that comes up constantly (especially in HN comments...)