Hacker News new | ask | show | jobs
by simon_luv_pho 105 days ago
Please use your own LLM api instead!

The free testing LLM is Qwen hosted by Aliyun. Qwen and DeepSeek are the only ones I can afford to offer for free. It's just there to lower the try-out barrier; please DO NOT rely on it.

The library itself does NOT include any backend service. Your data only goes to the LLM api you configured.

I tested it on local Ollama models it works fine.

1 comments

Or why not stay fully local with WebLLM... https://webllm.mlc.ai
That looks great! I also thought about calling the Gemini nano model embedded into Chrome (only extensions can do that). But after some testing on smaller models I found that anything smaller than 9b can’t really handle the complex tool call schema I use.

Qwen3.5 4b is quite good but still gives messy json quite often. But it’s very promising!

Maybe after one more model iteration or some fine-toning we can go fully embedded?