by embedding-shape 2 days ago
> Why didn't this author compare Llama 3 with GLM 5.2 (released 1 week ago), which is a more standard attention-based LLM? Comparing two separate families of LLMs and then pointing out that they are different is not a surprising result, and it detracts from the point the author is trying to make.

The entire point of the comparison is that LLMs look vastly different today than before. Comparing more similar LLMs would detract from the point I thought the author was trying to make.

It is misleading the reader since most current LLMs look the same as before. It is cherry-picking an example to make a point, which is not necessary at all for the argument he is trying to make.

> It is misleading the reader since most current LLMs look the same as before.

But most of them do not? They do look vastly different from the earlier incarnations of GPT and Llama.

Can you provide an example? I provided an example of a week-old model that looks mostly the same.