by echelon
240 days ago
What models did you try to fine-tune? Were the models at the time even good enough to fine-tune? Did they suffer from catastrophic forgetting? We have a lot of more capable open source models now. And my guess is that if you designed models specifically for being fine-tuned, they could escape many of the last generation's pitfalls. Companies would love to own their own models instead of renting from a company that seeks to replace them.
One annoying part was switching to new and better models that came out literally every week.
I don’t think it substantially changes anything. If anything, I think the release of more advanced models like qwen-next makes things like FP4, MoE, and reasoning tokens an even higher barrier to entry.