by civilized 1142 days ago
Humans can be consistent if we try. LLMs can't, even when prompted to be consistent, because they don't really understand what it means to be consistent.

Some humans can be consistent if we try some of the time.

Having done phone support in early parts of my career, I'd strongly dispute any notion that most humans can be consistent if we try for anything more than the shortest periods and following the very simplest of instructions.

Most people are really awful at maintaining the level of focus needed to be consistent, and it's one of the reasons we spend so much time drilling people on specific behaviours until they're near automatic instead of e.g. teaching people the rules of arithmetic, or driving, or any other skills and expecting people to be able to consistently follow the rules they've learnt. And most of us still keep making mistakes while doing things we've practised over and over and over.

LLMs are still bad at being consistent, sure, but I've seen nothing to suggest that is anything inherent.

I think one of the biggest issues with LLMs if anything is that they've gotten too good at expressing themselves well, so we overestimate the reasoning levels we should expect from them in other areas. E.g. we're not used to an eloquent answer from someone unable to maintain coherent focus and step by step reasoning because human children don't learn to speak like this before they're also able to reason fairly well, and that makes it confusing to deal with LLMs where the relative stage of development of different skills does not match what we expect.

> Humans can be consistent if we try.

That's a bold statement. Do you have evidence for this?

Driving in traffic only works because many other people's actions can be consistently understood.
But accidents happen in traffic all the time.
This is not as strong of an argument as you seem to think.

According to NHTSA [0] there are about 2.1 accidents per million miles driven. This includes fatality, injury-only and property-damage-only accidents. That is the equivalent of over 99.999% of miles driven without an accident. Over 5 nines of reliably consistent behavior.

[0] https://cdan.nhtsa.gov/tsftables/National%20Statistics.pdf
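
A quick sanity check of the arithmetic above, as a minimal sketch. It assumes the 2.1 accidents per million miles figure quoted from NHTSA and treats "miles driven without an accident" as simply one minus the per-mile accident rate, which is a simplification. The variable names are only illustrative.

```python
import math

# Figure quoted in the comment above (accidents per million miles driven).
accidents_per_million_miles = 2.1

# Per-mile accident rate and the complementary accident-free fraction.
accident_rate_per_mile = accidents_per_million_miles / 1_000_000
accident_free_fraction = 1 - accident_rate_per_mile

# "Number of nines": how many leading 9s the reliability percentage has.
nines = math.floor(-math.log10(accident_rate_per_mile))

print(f"accident-free fraction: {accident_free_fraction:.5%}")
print(f"nines of reliability: {nines}")
```

Under those assumptions the accident-free fraction sits just above the 99.999% mark, so "five nines" holds, though only barely short of rounding up to six.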

This is a fair point but if what we're looking for is consistent behavior I think we'd have to consider events that don't result in damage or even rise to being accidents (but that could have, if fortune had frowned) like being cut off in a merge or someone running a no-turn-on-red. Which is of course difficult to really measure.
Better to say, I think, that humans are much better at improving their approximation of consistency with mental effort, both because we can think silently instead of "out loud, step by step" and because some of the patterns of careful thought we engage in don't get written down naturally as text and so are unlikely to be hit upon by GPTs. The advantages of not being a strict feed-forward network.

That being said, the fact that GPT4 can just open its virtual mouth, spew forth without reflection, and still produce consistent text is clearly superhuman.

I think framing it this way could actually help us reach the next step with AI, if we ask how we could imbue it with those properties.

A human neural net is constantly bombarded with inputs from many different senses, which, firstly, get prioritized based on prior usefulness. That usefulness is updated all the time, constantly, and that's what any current AI implementation lacks.

1 - The continuous integration of data from all "senses". This one should be self-evident, as obviously, all our senses are constantly barraging our brain with data and it learns to handle that over time, in whichever way your genetic makeup + learned internal cognitive processes dictate it be handled.

2 - The network that decides which data requires which amount of attention, and whether to store it in short or long term memory. This is obviously tied in quite closely to 1, as you need massive amounts of data to understand underlying patterns, and which data is just spam, versus what's really valuable.

3 - And with these two things together comes the emergence of an improving approximation of consistency. Which means this itself is a metric which the agent running the other agents needs to be aware of. It's silly to think the human brain is a single agent. It makes way more sense to see it as various interacting agents that add up to more than the sum of their parts.

Now, that being said, I'm not an expert on AI or Data Science, but this is more or less how my understanding of computational theory of mind meets with biological computing and neural networks. My theory is that the first actually intelligent AI will be one that is composed of a network that makes decisions on how to spend a unit of iteration. One iteration becomes "an instant" to the AI. I.e., the AI decides to spend one iteration thinking, or spawns a sub-agent (which it is aware will consume resources that other processes might also need access to, but it needs to be able to decide which action to pursue).

So in all honesty, it's amazing to me that LLMs on their own have been able to achieve this level of "personhood" despite their being only a tiny subset of the whole that makes up a "conscious" entity.

Edit: Misunderstood parent's point.

Right. I sort of worry that because LLMs are able to be so coherent without anything resembling a human's working memory at all, then adding either a crude working memory still working with LLM tokens or adding a more sophisticated one assembled from "chunks"[1] will let an AI based on them ascend to clearly superhuman reasoning with just an architectural improvement and no need for any more flops invested than we already put into GPT4.

[1] https://en.wikipedia.org/wiki/Chunking_(psychology)

Mathematics exists, we can build consistent (up to a point of course, not everything can be free of contradiction) models that are pretty rigid.
Can you?
I'm not a mathematician, so I'm not at the horizon of mathematics - but I can branch off from what's already there to make what I need (and isn't just searchable) consistent with the rest of it.

You're arguing that recent AI developments are a big deal and I'm not arguing against that. But what anyone stating that needs to answer is why we should think that big deal is a good thing - since humans are using the technology and will control whom it benefits. We don't have a good track record there, historically, which is another area in which we are, sadly, consistent.

Don't large sectors of the economy rely on precisely consistent behavior to operate?
Economics depends on looking at large enough datasets that inconsistencies can be averaged out and glossed over. People generally following a pattern and precise consistency are very different things.
I don’t think another bold statement, phrased as a question, is what most people would consider “evidence.”
I'm phrasing it as a question because I'm asking you to do a small amount of mental legwork to observe the society around you.

How does the financial industry work? How are people comfortable executing transactions?

Do you generally rely on your bank account to not randomly fluctuate in its balance without cause?

Do you work in the tech industry? Do you rely on computers and algorithms and software to do things humans promised they would do?

All of this requires a very high level of consistency either in humans or in tools they have created.

To me, none of what you mention requires a whole lot of consistency, other than inasmuch as we understand that we are horribly inconsistent and so have a whole lot of ceremony and processes built up around what we do to mitigate the wild inconsistencies in the quality of work we do.
It's extremely interesting to me that all the examples here only point to humans being consistent in our faith in the systems and tools we build.
> How does the financial industry work? How are people comfortable executing transactions?

Because there are serious checks and balances? Because of double entry bookkeeping, reconciliation, audits and, ultimately, prisons? Systems of governance put into place with the explicit goal of ameliorating the vagueness of individual humans?

Aren't you mistaking the state and ability of being consistent, with the incentives for consistency? All these systems of governance would ultimately be useless without the ability to conform to them. And conform we do.

Maybe the ambiguity in this discussion is how to distinguish consistency from conformism?

Very high = ?%?
No, you can't, in any meaningful sense. Even the biggest bigots, with the most stable beliefs, like RMS or the Pope have a lot of contradictory beliefs. (At least, I think so. I don't have any evidence for this.)

Also, any strategy that you might come up with and that is simple enough for you to follow is trivially followable for ChatGPT as well.

Logical consistency is not the same thing as probabilistic consistency.