Hacker News
by hammyhavoc 1148 days ago
Here's another interesting idea: the marketing blurb mentions reducing costly errors—but with the amount ChatGPT hallucinates, how does this work? What steps have been taken to mitigate hallucination?

LLMs do indeed hallucinate.

What's been working well for us is giving it access to factual data sources such as calculators, component data sheets, etc.

It has done really well in our tests over the past few weeks, so well that we decided to invite everyone to explore the boundaries with a community preview.

Just give it a shot and report back.

Can you elaborate on "really well"? Because a few weeks of testing and "really well" versus the claim of reducing costly mistakes on the marketing blurb whilst simultaneously acknowledging hallucination doesn't inspire confidence.

I'd be interested to know a lot more about testing methodology too. A hallucination could be a much more expensive problem than someone simply making a basic mistake that someone might otherwise catch if they were doing things the usual way.

Not trying to be obstinate, genuinely interested, but sceptical as my experience of ChatGPT has been very negative due to hallucinations, to the point that I find that the marketing materials sound far too good to be true versus the comment here, which seems much more grounded.

To give you an example of the typical response quality we encountered:

@copilot can you design a low pass filter for me with 3db point of 300khz?

copilot: Sure, I can help you design a low-pass filter with a 3dB point of 300 kHz. A simple passive RC (resistor-capacitor) low-pass filter can be used for this purpose. To calculate the values of the resistor (R) and capacitor (C), you can use the following formula:

f_c = 1 / (2 * pi * R * C)

Where f_c is the 3dB cutoff frequency (300 kHz), R is the resistor value, and C is the capacitor value. Let's choose a standard resistor value, for example, 1 kΩ. Then, you can calculate the capacitor value as follows:

C = 1 / (2 * pi * R * f_c) = 1 / (2 * pi * 1000 * 300000) ≈ 530.5 pF

Since 530.5 pF is not a standard capacitor value, you can use the closest standard value, which is 560 pF. With these values, the actual 3dB cutoff frequency will be approximately:

f_c = 1 / (2 * pi * 1000 * 560e-12) ≈ 283.5 kHz

The low-pass filter can be assembled by connecting the 1 kΩ resistor in series with the input signal, followed by the 560 pF capacitor connected to the ground. The filtered output can be taken across the capacitor. This design will give you a first-order RC low-pass filter with an actual 3dB point of approximately 283.5 kHz.
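
For anyone who wants to check the arithmetic in the quoted response, here is a minimal Python sketch of the same first-order RC formula. The function and variable names are my own (hypothetical, not part of any tool discussed here), and it assumes ideal components with no tolerance. By my calculation the 560 pF choice lands at roughly 284.2 kHz rather than the quoted 283.5 kHz, a small difference but one worth verifying yourself.

```python
import math


def rc_capacitor_farads(r_ohms: float, f_c_hz: float) -> float:
    """Capacitance that puts the -3 dB point of an RC low-pass at f_c_hz."""
    return 1.0 / (2.0 * math.pi * r_ohms * f_c_hz)


def rc_cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


# Values taken from the quoted response: 1 kOhm resistor, 300 kHz target.
R_OHMS = 1_000.0
F_TARGET_HZ = 300_000.0
C_CHOSEN_F = 560e-12  # the standard value the response picked

c_ideal = rc_capacitor_farads(R_OHMS, F_TARGET_HZ)
f_actual = rc_cutoff_hz(R_OHMS, C_CHOSEN_F)

print(f"ideal C: {c_ideal * 1e12:.1f} pF")               # ideal C: 530.5 pF
print(f"cutoff with 560 pF: {f_actual / 1e3:.1f} kHz")   # cutoff with 560 pF: 284.2 kHz
```

Running a deterministic check like this next to the model's answer is one cheap way to catch a slipped digit.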

And how about the negative responses involving hallucination? Ideally, give an example of the worst hallucination encountered thus far, and of the mildest.

As a potential customer, I'm more interested in knowing where the potential known failure points or points of friction are than knowing what's working well.

Yes, when it works, it's certainly compelling in its responses—but as it never knows when it is wrong, it is equally compelling at a cursory glance when it hallucinates.

For most folks interested in this, the Dunning-Kruger effect will prevent them from picking up on when it's wrong because it's so convincing and sure of itself.

AI is useful for making suggestions, but you shouldn't blindly trust it like that no matter how advanced it gets. It helps you work faster, not beyond your abilities.

That's the thing, I'm not actually sure it does help someone work faster if everything it suggests needs to be fact-checked whilst under the guise of being always confident and never knowing it is wrong.

A lot of people liken AI to having a junior working with you, but most juniors have been to university and have a pretty solid understanding of a field in the first place. I wouldn't have a junior hallucinating things that don't exist on a regular basis.

The speed of generating an output is only as meaningful as how consistently correct and appropriate that output is in terms of its value. I remain unconvinced.

Please do report back if you catch it hallucinating.

This is exactly why we are doing the community preview…to make it your favorite design tool.