HN Mirror


	by sanchitmonga22 99 days ago
	That's a fair read. Tool calling reliability with sub-4B models is genuinely the hardest unsolved problem in on-device AI right now. The inference engine (MetalRT) is production-grade, the pipeline architecture is solid, but the models at this size are still the weak link for complex tool routing. Larger model support (where tool calling is much more reliable) is next on the roadmap. Please stay tuned!

1 comment

Sorry, I scrolled through some of the rest of the comments on this thread and can’t stay tuned.