by ftxbro
1180 days ago
His estimate is that you could train a LLaMA-7B scale model for around $82,432 and then fine-tune it, for a total of less than $85K. But the fine-tuned LLaMA-like models I've seen were, in my opinion, worse even than GPT-3. They were more like GPT-2.5 or so: not nearly as good as ChatGPT 3.5, and certainly not ChatGPT-beating. Of course, far enough into the future you could certainly run one in the browser for $85K or much less, maybe even $1.

My biggest problem: I haven't yet managed to get a great summarization out of a LLaMA derivative that runs on my laptop. Maybe I just haven't tried the right model or the right prompt, but summarization feels essential to me for a bunch of different applications.
I still think a LLaMA/Alpaca fine-tuned for the ReAct pattern that can execute additional tools would be a VERY interesting thing to explore.
[ ReAct: https://til.simonwillison.net/llms/python-react-pattern ]
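For anyone who hasn't seen the ReAct pattern, here is a rough sketch of the loop in Python. It is an illustration under assumptions, not the code from the linked post: `scripted_model` is a hypothetical stand-in for a fine-tuned LLaMA/Alpaca, `calculate` is a toy tool, and the `Action: tool: input` line format is just one convention I picked for the example.

```python
import re

# Matches lines such as "Action: calculate: 4 * 7 + 2" (format is an assumption).
ACTION_RE = re.compile(r"^Action: (\w+): (.*)$", re.MULTILINE)


def calculate(expr):
    """Toy arithmetic tool: only digits, whitespace and basic operators are allowed."""
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        raise ValueError("unsupported expression")
    return str(eval(expr))


# Hypothetical tool registry; a real setup might add search, file access, etc.
TOOLS = {"calculate": calculate}


def react_loop(model, question, max_turns=5):
    """Alternate between model output and tool observations until an answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_turns):
        reply = model(transcript)
        transcript += "\n" + reply
        match = ACTION_RE.search(reply)
        if match is None:
            # No tool requested, so treat the reply as the final answer.
            return reply
        name, arg = match.groups()
        tool = TOOLS.get(name)
        observation = tool(arg.strip()) if tool else f"unknown tool {name}"
        # Feed the tool result back so the next model call can use it.
        transcript += f"\nObservation: {observation}"
    return None  # gave up after max_turns


def scripted_model(transcript):
    """Stand-in for a real model: asks for one calculation, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I should compute this.\nAction: calculate: 4 * 7 + 2"
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Answer: {observation}"


print(react_loop(scripted_model, "What is 4 * 7 + 2?"))
```

Swapping `scripted_model` for a call into a local model is the interesting part: the loop only works if the model reliably emits the `Action:` line, which is presumably what the fine-tuning would be for.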