by ca_tech
949 days ago
They do mention that they expect the 70B model to perform even better. I expect that you are correct and that they determined the 13B was capable enough to serve as a base model. Why incur additional training time before getting preliminary results?