Hacker News new | ask | show | jobs
by ph1lw 534 days ago
Unfortunately no BYOLLM. Brave supports bringing your own LLM e.g. through Ollama

Besides that I'm using AI Summary Helper plugin for Chromium-based browsers https://philffm.github.io/ai-summary-helper/ which also allows using Ollama (or OpenAI / Mistral), asking questions to articles and inserting summaries right into the DOM (which is perfect for hoarding articles / forwarding them to Kindle)

2 comments

Sort of funny for the lack of local options considering that Mozilla funds llamafile even. Hopefully they allow some API integration, if they are using standard OpenAI API calls, it should be easy to enable swapping the endpoint.

Also, while it's nice to have a service option for those without any spare compute, I think it's a bit of a shame on the model considering how even at the 7B class, models like Llama 3.1 8B, Qwen 2.5 8B or Tulu 3 8B, Falcon 3 7B, all clearly outclass Mistral 7B (Mistral 7B is also very bad at multilingual, and is particularly inefficient at multilingual tokenization).

The current best fully open weights (Apache 2.0 or similar) small models currently are probably: OLMo 2 7B, Qwen 2.5 7B, Granite 3.1 8B, and Mistral Nemo Instruct (12B)

There's been a recent launch of a "GPU-Poor" Chat Arena for those interested in scoping out some of the smaller models (not a lot of ratings so very noisy, take it with a grain of salt): https://huggingface.co/spaces/k-mktr/gpu-poor-llm-arena

> Brave supports bringing your own LLM e.g. through Ollama

It's a shame Brave is so far ahead of the game but no one seems to notice.

I run Brave at home and a local LLM on the same machine, and didn't know this. I guess I'll be playing around this weekend.