Hacker News new | ask | show | jobs
by Florin_Andrei 117 days ago
I think we're still at the stage where model performance largely depends on:

- how many data sources it has access to

- the quality of your prompts

So, if prompting quality decreases, so does model performance.

3 comments

Sure, but the study is saying something slightly different, it's not that people write bad prompts for artifacts, they actually write better ones (more specific, more examples, clearer goals,...). They just stop evaluating the result. So the input quality goes up but the quality control goes down.
Seems like it’s impossible for output to be good if the prompt is bad. Unless the AI is ignoring the literal instructions and just guessing “what you really want” which would be bad in a different way.
> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

- Charles Babbage, https://archive.org/details/passagesfromlife03char/page/67/m...

EDIT: This is a new iteration of an old problem. Even GIGO [1] arguably predates computers and describes a lot of systemic problems. It does seem a lot more difficult to distinguish between a "garbage" or "good" prompt though. Perhaps this problem is just going to keep getting harder.

1. https://en.wikipedia.org/wiki/Garbage_in,_garbage_out

What does prompting quality even mean, empirically? I feel like the LLM providers could/should provide prompt scoring as some kind of metric and provide hints to users on ways they can improve (possibly including ways the LLM is specifically trained to act for a given prompt).
That would be a quality metric, and right now they are focused on quantity metrics.