by jrm4 933 days ago
Your skepticism is, I think, very well founded -- especially with such unclear definitions of "improvement."

I think I have a corollary-type idea: why aren't LLMs perhaps like "Linux," something that never really needs to be REWRITTEN from scratch, merely added to or improved on? In other words, isn't it fair to think that LoRAs are the really important thing to pay attention to?

(And perhaps, like Google Fuchsia or whatever, new LLMs might just be mostly a waste of time from an innovator's POV?)

1 comments

When it comes to training LLMs, the definition of “improvement” is incredibly clear, as one must literally code a loss function that the model then minimizes.

It gets murkier when trying to map that to actual capabilities, but so far, lower loss has led to much stronger capabilities.
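
To make "literally code a loss function" concrete, here is a minimal sketch of the kind of objective I mean, assuming the usual next-token cross-entropy. The names `cross_entropy` and `mean_loss` are just mine for illustration, not from any particular training codebase, and real training code would use a tensor library rather than plain lists.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of the target token under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def mean_loss(batch_logits, targets):
    """Average cross-entropy over a batch of next-token predictions."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

# A model that puts more weight on the correct token gets a lower loss.
uniform = cross_entropy([0.0, 0.0, 0.0], target=0)
confident = cross_entropy([4.0, 0.0, 0.0], target=0)
print(uniform, confident)
```

"Improvement" in the training sense is just this number going down on held-out text; whether that tracks what people care about is the murkier part.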