| HN Mirror

It is if the weights are sufficiently advanced.

blueflow 344 days ago

I find such statements frightening. Too many people cannot tell the difference between prevalence ("everybody does it") and factual correctness.

Nothing to do with dice though.

blueflow 344 days ago

The whole "stochastic means to find factual correctness" thing is an error of method; arguing about weights here is nonsense.

It isn't, though. The most factually correct human expert is also stochastic. The only question is how the dice are weighted.
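To make the "weighted dice" picture concrete, here is a minimal sketch of sampling a next token from a weighted distribution. The candidate tokens and their scores are made up for illustration, and `softmax` and `sample_token` are hypothetical helper names, not any particular model's API.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature=1.0, rng=random):
    """Roll the weighted dice once and return the chosen token."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidates and scores (assumed, not taken from a real model).
tokens = ["Paris", "Lyon", "Berlin"]
logits = [5.0, 1.0, 0.5]

rng = random.Random(0)  # fixed seed so repeated runs match
draws = [sample_token(tokens, logits, rng=rng) for _ in range(1000)]
print("share of 'Paris':", draws.count("Paris") / len(draws))
```

With weights this lopsided, the output is still random, but one answer wins almost every roll. That is the sense in which the argument is about the weighting and not about the presence of dice.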

blueflow 344 days ago

"human expert" as reference for "factually correct", oh just gently caress yourself. Appeal to authority (expert = social status) is as much bullshit as appeal to popularity.

The weights, so to speak, come from the knowledge base. That means you can't get away from the quality of the knowledge base. That isn't uniform across all domains of knowledge. Then the problem becomes how do you make the training material uniformly high-quality in every knowledge domain? At best it becomes the meta problem of determining the quality of knowledge in some way that makes an LLM able to calibrate confidence to a knowledge domain. But more likely we're stuck with the dubious quality that comes from human bias and wishful thinking in supposedly authoritative material.

Sure, it's only as good as the training data. But human experts also output tokens with some statistical distribution. That doesn't mean anything.

That sounds plausible. But it doesn't explain why LLMs make laughably bad errors that even a biased and haphazard human researcher wouldn't make.

Gemini seems to have a user interface that, for the way most people encounter Gemini, is more closely linked to search results. This leads me to suspect that Google's approach to training could be uniquely informed by both current and historic web crawling.

I think that's been a lot less true over the last year or so. Gemini 2.5 Pro is the first LLM I actually find pretty damn reliable.

contagiousflow 344 days ago

If you think talking to an LLM is the same experience as talking to a human you should probably talk to more humans

That's not what I said. What I said is that the claim "LLMs aren't intelligent because they stochastically produce characters" doesn't hold, because humans do that too, even if they're intelligent and authoritative.

krapp 344 days ago

We don't actually know how human cognition works, so how do you know that humans "stochastically produce characters?"

nijave 344 days ago

MCP and agents seem like a solution, but as far as I know, maintaining sufficient context is still a problem.

I.e. ability to plug in expert data sources

Fine-tuning and RAG should, in theory, enable applications of LLMs to perform better in specific knowledge domains by focusing annotation of knowledge on the domains specific to the application.
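As a rough illustration of the RAG half of that, here is a toy sketch: pick the stored passage that shares the most words with the question and prepend it to the prompt. Real systems typically use embeddings and a vector index; the word-overlap scoring, the function names, and the two sample documents are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Prepend the retrieved domain text to the question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical domain knowledge base (illustrative only).
docs = [
    "Aspirin is a nonsteroidal anti-inflammatory drug.",
    "The borrow checker enforces ownership rules in Rust.",
]

print(build_prompt("what does the borrow checker do in rust", docs))
```

The answer is still only as good as whatever sits in `docs`, which is the quality-of-the-knowledge-base point made upthread.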

JamesSwift 344 days ago

I think you're missing the point. The issue is not the amount of knowledge it possesses. The problem is that there's no way to go from "statistically generate the next word" to "what is your confidence level in the fact you just stated". Maybe, with an enormous amount of computation, we could layer another AI on top to evaluate or add confidence intervals, but I just don't see how we get there without another quantum leap.
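To make the gap concrete: the sampling step does give you a probability per token, and you can summarize it, but that number measures how peaked the next-word distribution is, not whether the stated fact is true. A minimal sketch, with made-up probabilities and hypothetical helper names:

```python
import math

def entropy(probs):
    """Shannon entropy in nats; higher means the dice are closer to fair."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_token_prob(token_probs):
    """Geometric mean of the probabilities of the tokens actually emitted."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

# Hypothetical next-token distributions (assumed values, not real model output).
peaked = [0.97, 0.02, 0.01]    # the model strongly prefers one continuation
flat = [1 / 3, 1 / 3, 1 / 3]   # the model has no preference at all

print("entropy, peaked:", round(entropy(peaked), 3))
print("entropy, flat:  ", round(entropy(flat), 3))

# Probabilities of the tokens chosen in some generated sentence (also assumed).
print("mean token prob:", round(mean_token_prob([0.9, 0.8, 0.95]), 3))
```

A fluent but false sentence can score just as well on both measures as a true one, so neither is the "confidence in the fact" being asked for.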

Of course there is. If its training forces it to develop a theory of mind then it will weight the dice so that it's more likely to output "I don't know". Most likely the culprit is that it's hard to make training data for things that it doesn't know.