by ashishheda 129 days ago
Wonder how it works?
1 comment

At a high level, it's a rolling buffer that uses the spare compute we're allocated for a conversation to hit a p50 latency under 80 ms. We label signals from raw conversation data and use them to align a small language model to produce these natural-language descriptions.
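
For anyone who wants something concrete, here is a rough sketch of what a rolling buffer over conversation turns and a p50 check could look like. This is only an illustration, not the actual implementation: `RollingBuffer`, `max_turns`, and `p50` are hypothetical names, and the window size and sample latencies are made up.

```python
from collections import deque
from statistics import median


class RollingBuffer:
    """Hypothetical rolling buffer: keeps only the most recent turns."""

    def __init__(self, max_turns: int) -> None:
        # A deque with maxlen drops the oldest item automatically on append.
        self._turns = deque(maxlen=max_turns)

    def push(self, turn: str) -> None:
        self._turns.append(turn)

    def window(self) -> list[str]:
        # Snapshot of the current window, oldest turn first.
        return list(self._turns)


def p50(latencies_ms: list[float]) -> float:
    """Median (p50) of latency samples, in milliseconds."""
    return median(latencies_ms)


# Assumed window size of 3 turns, purely for illustration.
buf = RollingBuffer(max_turns=3)
for turn in ["hi", "how are you", "fine", "what's next"]:
    buf.push(turn)
```

In this sketch the small model would only ever see `buf.window()`, so the work per update stays bounded no matter how long the conversation runs, which is one plausible way to keep the median latency low.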