|
I'm more interested in technical side of this, but I'm not seeing any links to GitHub with the source code of this project. Anyway, I have a tangential question, and this is the first time I see langchain, so may be a stupid one. The point is the vendor-API seems to be far less uniform than what I'd expect from a framework like this. I'm wondering, why cannot[0] this be done with Ollama? Isn't it ultimately just system prompt, user input and a few additional params like temperature all these APIs require as an input? I'm a bit lost in this chain of wrappers around other wrappers, especially when we are talking about services that host many models themselves (like together.xyz), and I don't even fully get the role langchain plays here. I mean, in the end, all that any of these models does is just repeatedly guessing the next token, isn't it? So there may be a difference on the very low-level, there my be some difference on a high level (considering different ways these models have been trained? I have no idea), but on some "mid-level" isn't all of this utlimately just the same thing? Why are these wrappers so diverse and so complicated then? Is there some more novice-friendly tutorial explaining these concepts? [0] https://python.langchain.com/v0.2/docs/integrations/chat/ |
The APIs between different providers are actually pretty similar, largely close to the OpenAI API.
The reason to use a paid service is because the models are superior to the open source ones and definitely a lot better than what you can run locally.
It depends on the task though. For this task, I think a really good small model like phi-3 could handle 90-95% of the entries well through ollama. It's just that the 5-10% of extra screw ups are usually not worth the privilege of using your own hardware or infrastructure.
For this particular task, I would definitely skip langchain (but I always skip it). You could use any of the top performing open or closed models, with ollama locally, together.ai, and multiple closed models.
It should be much less than 50 lines of code. Definitely under 100.
You can just use string interpolation for the prompts and request JSON output with the API calls. You don't need to get a PhD in langchain for that.