by jpgvm
1186 days ago
I think their realisation was that the models themselves aren't actually that difficult to replicate, even in the absence of a patent or a published description, i.e. they can't adequately defend their business with trade secrets. Patents probably wouldn't work either, because the structures are too easily recombined to bypass any conceivable patent that would be enforceable.

That said, I think all of this is emblematic of a deeper problem with the space: none of the recent LLM work has been groundbreaking, just continual refinement of a given branch. We aren't seeing evolution, just increasing either the number of parameters or the quality thereof, plus additional context. Which is why it was so easy for other folks to make the same progress in similar time periods.

Time will tell if we are about to slam into a local maximum, or if someone finds a significant evolution or, better yet, stumbles on a way to properly combine LLMs for context + NLP with traditional AI/logic/expert systems to engineer something that actually thinks and learns rather than regurgitating statistics.