by satisfice
247 days ago
When someone claims that a system can search “approximately” or “semantically”, that means there is some sort of statistical behavior. There will be error. That error can be systematically characterized with enough data. But if it can’t be, or isn’t, then it’s a toy. A problem I have with LLMs and the way they are marketed is that they are being treated as and offered as if they were toys. You’ve given a few tantalizing details, but what I would really admire is a link to full details about exactly what you did to collect sufficient evidence that this system can be trusted and in what ways it can be trusted.
|
In general, when using LLMs, there are no formal guarantees on output quality anymore (but the same applies when using, e.g., human crowd workers for comparable tasks like image classification etc.).
Having said that, we ran some experiments evaluating output accuracy for a prior version of ThalamusDB (results: https://dl.acm.org/doi/pdf/10.1145/3654989). We plan to publish more results for the new version within the next few months as well. But, again, no formal guarantees.
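
To make "characterizing the error" concrete, here is a minimal sketch of one generic way to do it: hand-label a random sample of rows, compare against what the semantic filter returned, and report precision and recall with Wilson score intervals. This is not the evaluation procedure from the paper and not part of ThalamusDB; the names `wilson_interval` and `evaluate_filter` and the input shapes are hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def evaluate_filter(predicted_ids, labeled_sample):
    """Estimate precision and recall of a semantic filter on a labeled sample.

    predicted_ids: set of row ids the filter returned (hypothetical input).
    labeled_sample: dict mapping row id -> True/False ground truth,
                    hand-labeled for a random sample of rows.
    """
    tp = sum(1 for rid, truth in labeled_sample.items() if truth and rid in predicted_ids)
    fp = sum(1 for rid, truth in labeled_sample.items() if not truth and rid in predicted_ids)
    fn = sum(1 for rid, truth in labeled_sample.items() if truth and rid not in predicted_ids)
    return {
        "precision": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        "recall": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
    }
```

The intervals are statistical statements about the sampled workload only. They describe observed error rates; they are not a formal guarantee for any individual query.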