by babyent 523 days ago
This was a fun read. I’m not an AI expert by any means. I’m also ESL. Please bear with me.

The inaccuracy threshold seems fine for a museum. In enterprise operations, however, inaccuracy can mean lost revenue or, worse, lost trust and future business.

I’m struggling with some more advanced AI use cases in my collaborative work platform. I use AI (LLMs) for things like summarization, communication, and finding information using embeddings. However, sometimes the results are completely wrong.

To test this, I spent a few days (while doing something unrelated) building up a recipes database and then trying to query it for things like “I want to make a quick and easy drink”. I ran the data through classification and other steps to get the data as good as I could. The results would still include fries or some other food when I’m asking for drinks.

So I have to ask: what the heck am I doing wrong? Again, for things like sending messages and reminders, coming up with descriptions, and finding old messages that match some input, no problem.

But if I have data that I’m augmenting with additional information (trying to attach information that may be missing but is possible to deduce from what’s available) to enable richer workflows, I’m always getting bitten in the butt. I feel like if I can figure this out, I can provide way more value.

Not sure if what I said makes sense.

1 comments

> Not sure if what I said makes sense.

Not sure either. But here is the lesson from this and other sources: to improve the output, use a multistep approach. Get the first answer, or several, and pass it through one or more verification steps, like 'for this *, is this * relevant?', or whether it is correct, whether it solves the problem, etc. Then select the answer with the best scores on those filters. You see, it's very similar to the approach in the original post: get first candidates, then filter.
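
To make the "get first candidates, filter" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Recipe` fields, the word-overlap `first_pass` (a stand-in for an embedding search), and the two rule-based checks (stand-ins for LLM verification prompts such as "is this recipe relevant to the request?"). It is not how any particular product does it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Recipe:
    name: str
    category: str  # hypothetical label from an earlier classification step
    minutes: int   # hypothetical preparation time


# A check takes (query, candidate) and returns a score between 0.0 and 1.0.
Check = Callable[[str, Recipe], float]


def first_pass(query: str, recipes: List[Recipe], k: int) -> List[Recipe]:
    """Stage 1: cheap candidate retrieval (stand-in for an embedding search).

    Ranks recipes by how many query words appear in the name or category.
    """
    query_words = set(query.lower().split())

    def overlap(recipe: Recipe) -> int:
        recipe_words = set(f"{recipe.name} {recipe.category}".lower().split())
        return len(query_words & recipe_words)

    return sorted(recipes, key=overlap, reverse=True)[:k]


def category_matches(query: str, recipe: Recipe) -> float:
    """Verification check: does the candidate's category fit the request?"""
    wanted = "drink" if "drink" in query.lower() else "food"
    return 1.0 if recipe.category == wanted else 0.0


def is_quick(query: str, recipe: Recipe) -> float:
    """Verification check: if the user asked for 'quick', is it under 10 minutes?"""
    if "quick" not in query.lower():
        return 1.0
    return 1.0 if recipe.minutes <= 10 else 0.0


def verify_and_rank(
    query: str, candidates: List[Recipe], checks: List[Check], min_score: float
) -> List[Tuple[Recipe, float]]:
    """Stage 2: score every candidate on every check, drop low scorers, rank the rest."""
    scored = []
    for candidate in candidates:
        score = sum(check(query, candidate) for check in checks) / len(checks)
        if score >= min_score:
            scored.append((candidate, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


recipes = [
    Recipe("quick and easy fries", "food", 10),
    Recipe("iced lemon drink", "drink", 5),
    Recipe("slow mulled wine drink", "drink", 45),
    Recipe("easy pancakes", "food", 20),
]
query = "I want to make a quick and easy drink"

candidates = first_pass(query, recipes, k=3)
results = verify_and_rank(query, candidates, [category_matches, is_quick], min_score=0.75)

for recipe, score in results:
    print(f"{recipe.name}: {score:.2f}")
```

In this toy data the first pass ranks the fries first, because "quick", "and" and "easy" all match, which is the same kind of miss described in the original post. The second stage is what removes it. With a real model, each check would be a separate call, so the number of candidates passed to stage 2 directly drives latency and cost.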

Any way to speed it up? So of course I can try (keyword) to improve the results, but it all comes down to how well the embedding is stored, the quality of the actual user content, and of course my prompt.

I have a multi-step workflow already, but it is getting slow now going through all those steps. And sometimes, if the result is wildly incorrect, it feels really bad.

Again, I'm not an expert in this field but I am trying to learn and improve my product.

You can try using online services with the best models. With prices like $1 for 10K requests, it's hard to make a competitive local solution. I'm going to use this approach in my personal projects.
Great! Same here, I’m using OpenAI models, not local. What about you?