Hacker News
by mert_gerdan 126 days ago
At a high level: a rolling buffer that uses the spare compute we're allocated for a conversation to achieve <80ms p50 results. We use signals labeled from raw conversation data to align a small language model to produce these natural-language descriptions.
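
As a rough illustration of the two ideas mentioned here (a rolling buffer and a p50 latency target), here is a minimal Python sketch. The class name `RollingBuffer`, the `max_turns` parameter, the `p50` helper, and the sample latencies are all hypothetical, and this is not the actual implementation described above.

```python
from collections import deque
from statistics import median


class RollingBuffer:
    """Keeps only the most recent `max_turns` conversation turns (hypothetical)."""

    def __init__(self, max_turns):
        # A deque with maxlen silently drops the oldest item once it is full.
        self._turns = deque(maxlen=max_turns)

    def append(self, turn):
        self._turns.append(turn)

    def window(self):
        # Snapshot of the turns currently retained, oldest first.
        return list(self._turns)


def p50(latencies_ms):
    # p50 is the median: half of the requests finish at or below this value.
    return median(latencies_ms)


buf = RollingBuffer(max_turns=3)
for turn in ["hi", "hello", "how are you", "fine"]:
    buf.append(turn)

recent = buf.window()  # the oldest turn ("hi") has been dropped
sample_p50 = p50([70, 75, 90, 60, 85])  # made-up latencies in milliseconds
meets_target = sample_p50 < 80
```

The bounded deque keeps memory and per-request work constant as a conversation grows, which is one plausible reason to use a rolling window when latency matters.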