Hacker News
by deepak_heatm 578 days ago
Sad. If Ollama has limitations with function calling, can we try NexusRaven-13B or a function-calling Mistral 7B?

In theory, yes, we could, but would it yield "good enough" results for a "testing" agent? Probably not. The LLM here is not just responsible for tool calling; it's also doing other intricate things, such as planning the next steps based on the input feature file and generating the browser/API automation code. In our experiments we found that OpenAI's GPT-4o performs best, followed by Haiku or Grok.