HN Mirror

by tharkun__ 470 days ago

No straws to clutch here. I've made such functions, and others, available to LLMs in order to implement some great functionality that would otherwise not have been possible. And they do a relatively good job. One of the issues is that they're not really reliable or deterministic. What the LLM does, or is capable of, today might not be what it does tomorrow, or what it does with ever so slightly different context added via the prompts the user wrote today vs. yesterday.
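For illustration only, here is a minimal sketch of what exposing a "Get the current date" type function to an LLM might look like. The names `get_current_date`, `TOOLS`, and `dispatch_tool_call`, and the JSON shape of the tool call, are all hypothetical and not taken from any particular LLM API or from the setup described here.

```python
import json
from datetime import datetime, timezone


def get_current_date() -> str:
    """Return today's date (UTC) as an ISO 8601 string, e.g. '2024-01-31'."""
    return datetime.now(timezone.utc).date().isoformat()


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {"get_current_date": get_current_date}


def dispatch_tool_call(raw_call: str) -> str:
    """Run the tool named in a JSON tool call and return a JSON result.

    Assumed (hypothetical) call shape: {"name": "...", "arguments": {...}}
    """
    call = json.loads(raw_call)
    name = call.get("name")
    if name not in TOOLS:
        # Report unknown tools explicitly instead of guessing.
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps({"name": name, "result": result})


if __name__ == "__main__":
    print(dispatch_tool_call('{"name": "get_current_date", "arguments": {}}'))
```

The function side of this is deterministic. Whether the model decides to call it, and whether it actually uses the result it gets back, is the part that is not.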

You are correct that the date thing by itself, if that were the only thing, would not be such a big deal.

But the date thing, and confidently telling me the wrong date, is a symptom and stand-in example of what LLMs will do in way too many situations, and regular people don't understand this. Like I said, not very intelligent / confident people will do the same thing. But with people you generally have a "BS meter" and a trust level. If you ask a random stranger on the street what time it is and they confidently tell you that it's exactly 11:20:32 a.m. without looking at their watch/phone, you know it's 99.99% BS. (Again, just a stand-in example; replace it with 'Give me a timeline of the most important thing that happened during WWII on a day-by-day basis' or whatever you can come up with.) Yet people trust the output of LLMs on questions where the user has no real way to know where on the BS meter the answer ranks. And they just believe it.

Happened to me today at work. The LLM very confidently made up large swaths of data because it "figured out" that our test env was using Star Trek universe characters and objects as test data. What it made up had no basis in reality, and to produce it the LLM basically had to ignore almost all the data that we actually returned from one of these "Get the current date" type functions we make available to it.

Thanks LLM!