by jjfoooo4 47 days ago
Simple vector similarity plus a cheap model to filter results works pretty well. Though ofc it does add tokens to your primary chat, which is the basic tradeoff of memory systems in general (in addition to latency).
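
A rough sketch of that two-stage setup, to make it concrete: rank stored memories by cosine similarity, keep the top k, then run a cheap filter over the candidates so only the survivors get added to the main chat. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and `keyword_filter` is a hypothetical placeholder for the cheap model call (in practice that would be a small LLM asked "is this memory relevant?").

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model (assumption): word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(
    query: str,
    memories: List[str],
    is_relevant: Callable[[str, str], bool],
    k: int = 3,
) -> List[str]:
    # Stage 1: vector similarity picks the top-k candidates.
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    candidates = ranked[:k]
    # Stage 2: a cheap filter drops candidates that only looked similar.
    return [m for m in candidates if is_relevant(query, m)]


def keyword_filter(query: str, memory: str) -> bool:
    # Hypothetical placeholder for the cheap model: requires a shared
    # non-stopword. A real setup would call a small model here instead.
    stop = {"the", "a", "is", "what", "my", "i", "to", "of"}
    q_terms = set(query.lower().split()) - stop
    return bool(q_terms & set(memory.lower().split()))


memories = [
    "user prefers dark mode in the editor",
    "user is allergic to peanuts",
    "the project uses postgres for storage",
    "user asked what the weather is",
]

result = recall("which database does the project use", memories, keyword_filter, k=2)
print(result)
```

With k=2 the similarity stage lets through a memory that only matches on "the", and the filter stage drops it. Whatever survives is what costs you tokens in the primary chat, and the filter call itself is where the extra latency comes from.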