by tlack 1118 days ago
I am experimenting with building software around the ReAct tool-prompting pattern, using Llama-derivative models like Manticore13B, Airoboros, etc. I script it all together using Microsoft Guidance with llama.cpp and AutoGPTQ. It works pretty well for simple tasks, and I know the costs are roughly fixed. Obviously these models are far less capable than OpenAI's products, but when you have tens of thousands of conversations to run, the costs of ChatGPT become a distraction. I haven't tried finetuning yet.
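For anyone who hasn't seen ReAct, here is a rough sketch of the loop in plain Python: the model emits a Thought and an Action, the script runs the named tool, appends an Observation, and repeats until a Final Answer appears. This is not my actual Guidance setup. The `scripted_llm` stand-in, the `calculator` tool, and the `Action: tool[input]` text format are all assumptions made for illustration; in practice the `llm` callable would wrap a local model.

```python
import re

def calculator(expression: str) -> str:
    """Hypothetical tool: evaluates basic arithmetic only."""
    # Allow only digits, whitespace and arithmetic symbols before evaluating.
    if not re.fullmatch(r"[\d\s+\-*/().]+", expression):
        return "error: unsupported expression"
    try:
        return str(eval(expression))
    except Exception as exc:
        return f"error: {exc}"

# Illustrative tool registry; names are assumptions, not from any library.
TOOLS = {"calculator": calculator}

# Assumed output format: "Action: tool_name[input]" and "Final Answer: text".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def scripted_llm(prompt: str) -> str:
    """Stand-in for a local model call; returns canned ReAct steps."""
    if "Observation:" not in prompt:
        return "Thought: I need to compute this.\nAction: calculator[12 * 7]"
    return "Thought: I have the result.\nFinal Answer: 84"

def react(question, llm, max_steps=5):
    """Run the Thought/Action/Observation loop until a final answer or step limit."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(prompt)
        prompt += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            return None  # model produced neither an action nor an answer
        name, arg = action.groups()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"error: unknown tool {name}"
        # Feed the tool result back so the next model call can use it.
        prompt += f"Observation: {observation}\n"
    return None  # step budget exhausted

answer = react("What is 12 * 7?", scripted_llm)
```

The `max_steps` cap is what keeps the cost per conversation bounded: a small model that never produces a final answer just gets cut off instead of looping.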