Hacker News
by sdenton4 1013 days ago
On the modeling side, it's compelling to separate the memory from the linguistic skills. Vector search is hella fast and can be very good. So you can offload the memorization part of the problem to a retrieval index, and let the language model focus on the language. This should allow better performance with much smaller models.
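
To make the split concrete, here's a rough sketch of the idea, not any particular system's implementation. Everything in it is made up for illustration: `embed` is a toy bag-of-words stand-in for a learned encoder, `Memory` is a brute-force store (a real setup would presumably use an approximate nearest-neighbor index to get the speed), and `build_prompt` just shows where the retrieved text would be handed to a small language model, which isn't included here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Memory:
    """Hypothetical external memory: stores facts and answers similarity queries."""

    def __init__(self, facts):
        self.facts = list(facts)
        self.vectors = [embed(f) for f in self.facts]

    def search(self, query, k=1):
        # Brute-force scan over all stored vectors; fine for a toy example.
        q = embed(query)
        ranked = sorted(
            zip(self.facts, self.vectors),
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return [fact for fact, _ in ranked[:k]]


def build_prompt(memory, question, k=1):
    """Put retrieved facts in the prompt so the model doesn't need to memorize them."""
    context = "\n".join(memory.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


memory = Memory([
    "The Eiffel Tower is in Paris.",
    "Mount Everest is the tallest mountain on Earth.",
    "Python was created by Guido van Rossum.",
])
prompt = build_prompt(memory, "Who created Python?")
print(prompt)
```

The point of the layout is that the facts live in `Memory`, so updating what the system "knows" means editing the store, not retraining the model that reads the prompt.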