by spacebanana7 1147 days ago
> The thing about LLM is that it answers based on data it has seen before.

Modern LLMs are able to perform web searches to make decisions on contemporary data. Once they have proper API support your concerns should be resolved, hopefully in a few weeks.

> reliability and safety issues.

The solution to this is fine tuning / RHLF. OpenAI have done a pretty extensive job at getting political safety for ChatGPT with RHLF. It seems reasonable that RHLF could achieve a similar result in the hardware domain.

> you can't ask questions you didn't know you needed ....

Solvable by prompt engineering. You can wrap user input in a prompt. As a toy example: "Here is user input $userInput if you have safety concerns about their project please respond with questions you think the user forgot to ask". Might also be possible to tweak with fine tuning/RHLF.
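
A minimal sketch of that wrapping step in Python, assuming a plain string template is enough. The function name `wrap_user_input` and the sample question are made up for illustration, and no model API is called here; the resulting string is what you would send to whatever LLM you use.

```python
from string import Template

# Wrapper prompt adapted from the toy example above.
# $userInput is the placeholder that receives the raw user text.
SAFETY_WRAPPER = Template(
    "Here is user input: $userInput\n"
    "If you have safety concerns about their project, please respond "
    "with questions you think the user forgot to ask."
)


def wrap_user_input(user_input: str) -> str:
    """Embed raw user text in the wrapper prompt (hypothetical helper)."""
    # Substituted values are inserted as-is and not re-parsed,
    # so a "$" inside the user's text does not break the template.
    return SAFETY_WRAPPER.substitute(userInput=user_input)


# Hypothetical usage: build the prompt, then pass it to a model of your choice.
prompt = wrap_user_input("How do I wire a 240V heater to a 15A breaker?")
print(prompt)
```

This only covers the prompt construction. Whether the model then actually raises the right questions is a separate matter that would need testing.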

1 comments

"RHLF" is Reinforcement Learning from Human Feedback? (Strange acronym.)

I don't see how that helps a tool become useful to a very skilled person, if the "human" side of things is polluted by 95+% of users having very low skill. It's great that you can train LLMs on the world's best reference material! But I don't see how you can get the world's best updates into that training set without hiring the world's best experts. So the tool will have very little value for anyone above a certain skill ceiling. Search has already fallen victim to this effect (I'm tired of results pages full of beginner material when I have a deeper question!) and I cannot see this being better for augmentation training sets for LLMs.

It should be RLHF, my bad with the spelling.

> So the tool will have very little value for anyone above a certain skill ceiling.

LLMs aren't great for doing tasks you don't know how to do, because you'll eventually have to debug the output. However, they excel at performing time-consuming tasks that you could do yourself if you really wanted to.

That's why I think they'll actually be more useful for experts.