|
this proves that all llm models converge to a certain point when trained on the same data. ie, there is really no differentiation between one model or the other. Claims about out-performance on tasks are just that, claims. the next iteration of llama or mixtral will converge. LLMs seem to evolve like linux/windows or ios/android with not much differentiation in the foundation models. |