Hacker News
by huijzer 1306 days ago
> Use some sort of instruction tuning to get the thing "good enough" that it gives decent results 75% of the time and the other 25% a human has to take over.

How does the model know when a human has to take over?

I think most extrapolations of current "AI" capabilities into future capabilities are fun and useful in some ways, but also doomed to fail. It's very easy to miss a tiny detail which may in practice be a fundamental problem.

> Use the actual usage data as training input.

Given that those bigger state-of-the-art models train on terabytes of data, how would you know how much training data to generate to sufficiently change the output?

My understanding of "AI" is that it's mostly about some very complex models which are capable of solving previously unsolvable problems. However, those problems are always extremely specific. Going the other way, thinking of problems or future possibilities first and then applying "AI" to them, is likely to fail.

3 comments

Much of the time, knowing when a human has to take over isn't one of the more difficult problems: the AI can't map the user input to a possible continuation with any high probability, or the AI interprets the user input as an expression of frustration or an assertion that it's wrong.
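A minimal sketch of those two handoff signals, in Python. The threshold, the phrase list, and the `Candidate` type are all hypothetical and chosen only for illustration; a real system would likely use a classifier rather than substring matching.

```python
from dataclasses import dataclass

# Hypothetical values, picked for illustration only.
CONFIDENCE_FLOOR = 0.6
FRUSTRATION_MARKERS = ("that's wrong", "not what i asked", "speak to a human")


@dataclass
class Candidate:
    text: str
    probability: float  # the model's own score for this continuation, in [0, 1]


def needs_human(user_input: str, candidates: list[Candidate]) -> bool:
    """Return True when either handoff signal fires."""
    lowered = user_input.lower()
    # Signal 1: the user sounds frustrated or asserts the answer is wrong.
    if any(marker in lowered for marker in FRUSTRATION_MARKERS):
        return True
    # Signal 2: no continuation is scored with high probability.
    best = max((c.probability for c in candidates), default=0.0)
    return best < CONFIDENCE_FLOOR
```

A check like this only fires when the model's own score is low or the user complains; it does not catch an answer that is scored highly and still wrong.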

The challenge is when the AI has to interpret questions that can be phrased in syntactically similar ways but carry very different or precisely opposite meanings, so it ends up very confidently (and plausibly) wrong about things like price changes and tax, event timings, refunds, etc.

AI should be watched by an AI critic (or an AI guard), whose goal is to detect harmful, dangerous, stupid, or surprising behavior and raise an alarm.

For example, image generators are watched for NSFW content by a separate AI critic.
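The critic idea can be sketched as a wrapper around the generator. Everything here is an assumption for illustration: `guarded`, the toy generator, and the keyword-based critic are hypothetical stand-ins for two separate models, not any real product's API.

```python
from typing import Callable

Generator = Callable[[str], str]
# A critic returns a list of objections; an empty list means "no objection".
Critic = Callable[[str, str], list]


def guarded(generate: Generator, critic: Critic,
            fallback: str = "[withheld by critic]") -> Generator:
    """Wrap a generator so every output passes through a separate critic."""
    def run(prompt: str) -> str:
        output = generate(prompt)
        objections = critic(prompt, output)
        if objections:
            # "Raise alarm": just a print here; a real system might log or page.
            print(f"ALARM for prompt {prompt!r}: {objections}")
            return fallback
        return output
    return run


# Toy stand-ins for the two models.
def toy_generate(prompt: str) -> str:
    return f"echo: {prompt}"


def toy_critic(prompt: str, output: str) -> list:
    banned = ("nsfw",)
    return [word for word in banned if word in output.lower()]


safe_generate = guarded(toy_generate, toy_critic)
```

Keeping the critic as a separate callable means it can be trained, swapped, or audited independently of the model it watches.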

> How does the model know when a human has to take over?

It’s incredibly easy, you ask “did this answer solve your issue?” and add a max_tries.
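Roughly, as a sketch: `resolve`, `answer_fn`, and `solved_fn` are hypothetical names, with `solved_fn` standing in for showing the answer and asking the user whether it solved their issue.

```python
from typing import Callable, Optional


def resolve(question: str,
            answer_fn: Callable[[str, int], str],
            solved_fn: Callable[[str], bool],
            max_tries: int = 3) -> tuple:
    """Try up to max_tries answers, then hand off to a human."""
    for attempt in range(1, max_tries + 1):
        answer = answer_fn(question, attempt)
        # Ask the user: "did this answer solve your issue?"
        if solved_fn(answer):
            return ("resolved", answer)
    no_answer: Optional[str] = None
    return ("escalate_to_human", no_answer)
```

This only works where a failed attempt is cheap and the user is still around to say no.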

> … how do you know how much training data to generate …?

You don’t, you keep doing it until the results improve to meet your goals, or they stop short and you switch tactics.
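As a sketch of that loop: `train_and_eval` is a hypothetical callback that retrains on the first `n` collected examples and returns a score on a held-out set; the batch size, patience, and minimum gain are arbitrary illustration values.

```python
from typing import Callable


def collect_until_good_enough(train_and_eval: Callable[[int], float],
                              goal: float,
                              batch_size: int = 1000,
                              patience: int = 2,
                              min_gain: float = 0.005,
                              max_rounds: int = 20) -> tuple:
    """Add data in batches until the goal is met or the gains stall."""
    n_examples = 0
    best = float("-inf")
    stale_rounds = 0
    for _ in range(max_rounds):
        n_examples += batch_size
        score = train_and_eval(n_examples)
        if score >= goal:
            return ("goal_met", n_examples, score)
        if score > best + min_gain:
            best = score
            stale_rounds = 0
        else:
            stale_rounds += 1
            if stale_rounds >= patience:
                # Results stopped short: time to switch tactics.
                return ("plateaued", n_examples, score)
    return ("budget_exhausted", n_examples, best)
```

The amount of data needed is never computed up front; it falls out of when the loop stops.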

That will work incredibly well for the self-driving AI cars that lost control at speed. /sarcasm Not all problems in life have the opportunity to be retried more than once.

You are maybe not thinking flexibly enough.

That’s why they have a fleet (parallelization) and why they outfitted the cars with sensors before self-driving was a thing (so they could simulate decision-making and have it corrected by driver action).

Their customers’ feedback absolutely trained their models.
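A toy sketch of that kind of shadow setup, where the model proposes an action but never acts, and the driver's actual action is kept as the correction. `Frame`, `shadow_collect`, and the one-number "sensor" are hypothetical simplifications, not a description of any manufacturer's actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Frame:
    sensors: tuple      # hypothetical sensor readings for one moment in time
    driver_action: str  # what the human driver actually did


def shadow_collect(frames: Iterable[Frame],
                   policy: Callable[[tuple], str]) -> list:
    """Run the policy without acting on it; keep the frames where the driver disagreed."""
    corrections = []
    for frame in frames:
        proposed = policy(frame.sensors)
        if proposed != frame.driver_action:
            # The driver's action becomes the label for a new training example.
            corrections.append((frame.sensors, frame.driver_action))
    return corrections
```

Each disagreement is a labeled example gathered without the model ever controlling the car, which is how a retry-style loop can exist for a problem that cannot be retried on the road.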