| HN Mirror

	by swaminarayan 109 days ago
	How are you doing semantic end-of-turn detection without adding latency to the critical path? Is it a separate lightweight model or integrated into the LLM stream?