

by lovelearning 85 days ago

You're right that LLMs do spit out false information or wrong knowledge. I've experienced it too.

But a redeeming quality is that we can ask the same LLM to fact check its own answer step by step in real time with little effort. They often identify their own hallucinations and reduce the probability of retaining that mistake in the rest of the conversation.

This isn't easy with human sources. Fact checking without LLMs and asking the sources to fact check themselves both take more effort, so it's often not done at all.

We also often ignore subtle but very common biases in human media sources [1], which create other types of errors, like omissions and euphemisms, that have been no less harmful than LLM hallucinations. Take the case of the Iraqi WMDs and the NYT's spreading of that disinformation, for example [2].

Regarding valuable information and rehashing, we probably shouldn't equate all the things LLMs can do with AI-generated articles. The quality of the latter may be entirely due to the lack of interest or attention, or the cost concerns, of whoever generated the article. Anecdotally, I have often found valuable knowledge and obscure connections by using deep research tools with careful prompts.

Lastly, if you're frequently finding something new from human-written sources, and LLMs are being trained on most of those same sources, isn't it logical that the latter will also likely output that same information?

This is why I feel human and AI sources are probably best used as complementary tools. Neither set of sources is perfect but each set has its strengths. By using both, we can get closer to an objective truth than by using only one of them.

[1]: https://gipplab.uni-goettingen.de/wp-content/uploads/2022/04...

[2]: https://www.theguardian.com/media/2004/may/26/pressandpublis...

3 comments

dalke 85 days ago

The effort to fact check with LLMs is also high. Here's one from a few days ago.

Someone used AI to generate an image in the style of a Charles Schulz Peanuts cartoon.

Someone else observed that there were 5 fingers on the characters, and quoted Google AI as saying “Charlie Brown, along with other Peanuts characters, is generally depicted with four fingers on each hand (three fingers and one thumb) ...”

Yet if you go to the Wikipedia entry at https://en.wikipedia.org/wiki/Peanuts you'll see the kids have 5 fingers. Or take a look at the actual cartoons. Or read the TVTropes entry https://tvtropes.org/pmwiki/pmwiki.php/Main/FourFingeredHand... under "Comic Strips".

Fact checking this with human sources is easy and not ambiguous. LLMs, meanwhile, are trained on text saying that many cartoon characters have only a thumb and three fingers - it is a trope for a reason - so isn't it logical for LLMs to give the wrong answer for a comic where the human characters are actually drawn with 5 fingers?

My experience with LLMs is they keep getting things wrong, when details matter.

Do you ask the LLM to fact check everything? (In which case, why isn't that part of the standard prompt?) Or do you only ask to fact check things where you are unsure about the answer? (In which case, is it the algorithm telling you what you want to hear?) When do you stop the fact checking?


lovelearning 84 days ago

> When do you stop the fact checking?

Exactly the same calculus as fact checking anything else from any other source. What are the social/economic/ethical consequences to me if the answer is wrong or inaccurate or incomplete? How much time do I have to check? How thorough should I be?

I imagine this calculus isn't really that different for most people. Or is it?

As for your example, I believe it. But I also feel it's a rather outlier example involving image comprehension of an obscure factoid. That isn't typical of how I use LLMs, which is mostly as text-based question answering engines, and not what I had in mind when writing the comment.

I guess LLMs for image comprehension need a much higher level of skepticism.


dalke 84 days ago

Well, in my case going to a Peanuts comic and looking at hands was pretty easy, and didn't involve any questions about negative environment or labor consequences, the massive hammering of web sites to gather data, centralization of power, and the like.

Like, "!w Peanuts" in my search bar, look at the image, and count fingers.

"a rather outlier example"

You wrote that you use AI to find "obscure connections" - aren't those all by definition outliers?

"mostly as text-based question"

I just now asked Google AI "how many fingers are on charlie brown's hand?"

It replied "In the Peanuts comic strip, Charlie Brown and the rest of the gang are traditionally drawn with four fingers (or three fingers and a thumb) on each hand."

No image comprehension, exactly as you had in mind. And completely false.

And that's from a training corpus which almost certainly includes statements that the kids are drawn with 5 fingers, since I confirmed that info on TVTropes and Reddit comments, like https://www.reddit.com/r/pics/comments/swod8/charlie_brown_h... .


lovelearning 84 days ago

HN isn't showing me a reply option for your latest comment, so I'll reply here instead.

Just to clarify, I used plain Google search, not Google AI mode. And I opened search results which seemed "reputable," without knowing much about the Peanuts cartoon or cartooning.

I had no idea at all about archive.org having it and didn't see it listed in the first two pages of search results.

I still find it confusing, especially given what the Variety.com link says which doesn't mention orientation. If the acceptable explanation for 4 vs 5 is orientation, why is it wrong when the AI generated 4 fingers? Does it not match the rest of the orientation?

Anyway, I'm not sure where this leaves LLMs. I'll explore image capabilities when I get some opportunity and keep your comment in mind.


dalke 84 days ago

The comment about using Google was more a curiosity. I hadn't seen the Variety link until yesterday, when I went to Google to reproduce the answer to verify it was from a text query, not an image query. Both Google AI and one of the top answers included that Variety link. When you mentioned it again, it strongly suggested you were using Google as your primary search method.

I think the right way to interpret the Variety link is that it's a single paragraph about trying to capture the feel of the comic using 3D software. As you saw from Charlie Brown holding a baseball, Schulz didn't go for a realistic look, but still conveyed the sense of grasping. Modeling all five fingers all the time would not give the movie the right feel.

I wonder now if Google AI incorporates text from the top results into its answer.

"why is it wrong when the AI generated 4 fingers?"

The original discussion started when person X used AI to generate an image "in the style of Charles Schulz" where the Peanuts characters had 5 fingers, then person Y noted the use of 5 fingers instead of the 4 which is common in comics and cartoons, and quoted Google AI as saying Peanuts was traditionally drawn with 4 fingers.

Yesterday I verified that Google AI would generate the same wrong answer with a text query, so it was not an image interpretation issue.

FWIW, after looking at a few hundred Peanuts cartoons, I can confidently say the AI generated image was not in the style of Schulz. The generated fingers were too realistic, and the background too complicated and detailed. :)

This for me is another example of why using primary sources should be the first thing to consider when fact checking - not LLMs (my experience is they are horrible at details), and not secondary sources (which have their own biases).

Not everything has easily-accessed primary sources, but many do. I think it's all too easy to fall into the trap of accepting the LLM answer because it feels right and is easy to generate. At https://freethoughtblogs.com/stderr/2025/01/18/ai-art-just-r... you'll see someone asked about which river Marbot swam across to spy on the enemy camp. It replied "Elbe". Then I did a text search of an English translation of the book and found he used a boat to cross the Danube to spy on the enemy camp, and he swam into freezing waters to save an enemy soldier.

Again, do you ask the LLM to fact check itself every single time? If that's useful, why isn't it built into the prompt? Or, if you are supposed to double-check the LLM yourself, why would you consult a secondary source if the primary source is so easy to find and search? And in that case, why not just use the primary source?

Further, if you aren't in the habit of checking primary sources then you won't have the experience to know how to find and check primary sources.


lovelearning 84 days ago

Even as a human, I find whatever sources Google shows to be inconsistent. I can't give any confident answer about the number of fingers. I think the answer is actually "4 sometimes and 5 other times."

So I'm not sure how much LLMs can handle this kind of inconsistency between "reputable" visual sources and text sources, nor how representative this example is.

A "reputable source" like Variety says this...

https://variety.com/2015/film/spotlight/charlie-brown-steve-...:

> “The rig would automatically move the features around so it would match the way Charles Schulz drew the character,” Heller says....In some drawings, Charlie Brown has just three fingers, while in others, he has five

Images from another website...

https://cartoonresearch.com/index.php/cartoons-at-bat-part-1... :

1. https://cartoonresearch.com/wp-content/uploads/2025/09/Lost-... -> 4 fingers

2. https://cartoonresearch.com/wp-content/uploads/2025/09/image... -> 4 fingers

Anyway this wasn't the type of obscure connections I was referring to though I can understand you interpreting it that way.

Personally I think this example supports what I said about "reputable sources." They can't be blindly trusted either because they may be inconsistent with each other and which one we choose to believe (Reddit.com or TVTropes.com or Variety.com) becomes entirely subjective.


dalke 84 days ago

Your first link was cited in the 2nd half of Google AI's answer, and is one of the top Google answers, so I think you are using Google as your information source.

The large majority of the images you link to show kids with 5 fingers, as well as 5-fingered baseball gloves. The cases of four fingers are due to orientation.

Your "1." also shows Marcie with five fingers. You see Charlie Brown with 4 fingers because he's holding a baseball. In 2. he's also holding a baseball. You would not see 5 fingers on one side because doing so would look strange.

In your unlabeled "0." there are plenty of kids with 5 fingers. There are some with fewer, but they are holding things or drawn in a way to suggest we are seeing the hand from the side.

I don't understand your hesitancy. Your own samples should be enough for you to decisively conclude that the Google AI's claim that Peanuts was "traditionally drawn with four fingers (or three fingers and a thumb) on each hand" is wrong. If not, it sure seems like you trust Google AI over your own eyes. Why are you so hesitant to agree?

My point is that you don't need to consult secondary sources when the primary sources are easily available.

When this came up a few days ago, I spot checked the complete works of Peanuts, from a collection on archive.org at https://archive.org/details/peanutscomics19502000/Volume%201... . The consistent pattern across the nearly 50 years of Peanuts is the kids have five fingers unless obscured by orientation or objects.

You can do that yourself, and triple-check that Google AI's answer is clearly wrong.

Thus, I think it's a good example of how fact checking with LLMs can lead people astray, and the large negative externalities I mentioned combined with its well-known tendencies to make incorrect statements make it a very poor starting point when the primary source, at least in this case, is so easy to access.

If most of the sources are wrong, and LLMs are being trained on those, isn't it logical that the latter will also likely output that same wrong information?

When do you know if most of the sources are wrong, unless you yourself know most of the sources are wrong?


thisischayan 74 days ago

We were facing the same challenge and had to build something that delivers consistent, near-99.99% accuracy — it’s called LiveFix (livefix.ai).

It’s a drop-in proxy between your app and your LLM. Every response is corrected during generation, not after. One API call. No retries.

Each response returns with a trust status: *verified*, *needs_review*, or *requires_human* — no silent failures.

We’re seeing a ~99% pass rate across thousands of clinical documents. Budget models are matching premium-level accuracy at ~75% lower cost. Benchmarked against top-tier budget and frontier models, with performance improving across the board — benchmarks are published.


esalman 84 days ago

> But a redeeming quality is that we can ask the same LLM to fact check its own answer step by step in real time with little effort. They often identify their own hallucinations and reduce the probability of retaining that mistake in the rest of the conversation.

It is not really a redeeming quality of LLMs. It falls apart because if you want to properly fact check an LLM, you have to make yourself acquainted with human-generated content: facts, arguments, and biases from "both sides" when it comes to complex topics. What's the point of relying on an LLM if you have to do that anyway to reach objective truth?

Case in point: someone in a tweet used a LOTR reference for the US/Israel vs Iran conflict, equating Iran to Sauron and US/Israel to the good guys. I asked Grok if that is a valid parallel. In the beginning, Grok agreed with the tweet. Then, when I pointed out how Iran has been weakened by decades of US sanctions, it started to agree that the US might be more like Sauron's army. Grok time and again used the Islamic regime's treatment of its own people as evidence that Iran might be the evil force, but when I cited Israel's treatment of other Arab civilians, or the Shah regime's treatment of its own people back in the day and its allegiance to the US, it changed its stance again. When I cited Israel's influence on US policies, Grok even admitted that Israel might be the Wormtongue here.

This also shows how an LLM totally falls apart as a way to reach objective truth, because you can manipulate it to say anything that you want. I could easily present evidence from my own research and make Grok reach completely opposite conclusions.


lovelearning 84 days ago

Gemini (currently) gave this reasonable and unbiased answer that challenges both pro-US/Israeli and pro-Iranian biases.

It's also a good point for the OP - when selecting an LLM, ask the people they're dealing with to ask 2-3 LLMs. It's much faster (and cheaper!) than reading about a complex geopolitical issue (that may not matter to them enough to spend a lot of money on books and research papers) while at the same time correcting common biases on all sides.

----------

> Someone said in the ongoing US/Israel vs. Iran conflict, Iran is like Sauron from the Lord of the Rings and US/Israel are like the good guys. Is this a valid parallel given the history of US, Israel, and Iranian regimes?

Comparing real-world geopolitical conflicts to fictional narratives like The Lord of the Rings is a common rhetorical device used to frame complex situations in terms of absolute good versus absolute evil. However, determining whether this is a "valid" parallel depends heavily on one's political perspective, as real-world geopolitics rarely aligns perfectly with the clear-cut morality of fantasy literature.

Historians, international relations scholars, and observers generally view such analogies as overly simplistic, as the histories of the United States, Israel, and Iran are marked by competing interests, shifting alliances, and moral complexities.

Here is a breakdown of how different perspectives view this analogy:

*The Perspective Aligning with the Analogy*

Those who support the analogy framing Iran as the villain ("Sauron") and the US and Israel as the "good guys" often point to the current actions and rhetoric of the Iranian government:

* *Regional Destabilization:* The Iranian government heavily funds and arms the "Axis of Resistance," a network of proxy militias including Hezbollah in Lebanon, Hamas in Gaza, and the Houthis in Yemen. These groups have engaged in prolonged armed conflicts with Israel and targeted international shipping.
* *Existential Rhetoric:* The leadership of the Islamic Republic of Iran frequently uses rhetoric calling for the destruction of Israel and the downfall of the United States (often referring to them as the "Little Satan" and the "Great Satan").
* *Human Rights:* Iran's theocratic regime is heavily criticized by international human rights organizations for its severe suppression of internal dissent, lack of democratic freedoms, and treatment of women.

From this viewpoint, the US and Israel are democratic nations acting defensively to protect their citizens, regional stability, and global security from an aggressive, authoritarian regime seeking regional domination and nuclear capabilities.

*The Perspective Challenging the Analogy*

Critics of this analogy argue that casting the US and Israel as flawless "good guys" and Iran as an irredeemable evil ignores significant historical context and the controversial actions of Western and Israeli governments:

* *Historical Interventions:* Critics point to the 1953 Iranian coup d'état, in which the US and UK intelligence agencies orchestrated the overthrow of Iran’s democratically elected Prime Minister, Mohammad Mosaddegh, to reinstate the authoritarian Shah. This event deeply scarred the Iranian psyche and laid the groundwork for the 1979 Islamic Revolution and the regime's current anti-Western stance.
* *US Foreign Policy:* Observers critical of the US point to its history of military interventions in the Middle East, such as the 2003 invasion of Iraq, which resulted in hundreds of thousands of civilian casualties and regional destabilization.
* *The Israeli-Palestinian Conflict:* Critics of Israel point to the prolonged military occupation of Palestinian territories, the expansion of settlements, and the high civilian death tolls in conflicts like the ongoing war in Gaza. From the perspective of Iran and many in the Global South, Israel is viewed as an oppressive or colonial force, and the US as its imperial enabler.

From this viewpoint, Iran frames itself not as an aggressor, but as the leader of a resistance movement pushing back against Western hegemony and defending marginalized populations in the Middle East.

*The Analytical Perspective*

In The Lord of the Rings, Sauron is a localized, supernatural force of pure malice with no valid grievances, while the coalition fighting him represents the preservation of life and freedom.

Geopolitical analysts generally avoid mapping this binary onto international relations. Instead, they view the US/Israel vs. Iran conflict through the lens of realpolitik and state interests. In reality, all states involved are acting to secure their own survival, project regional power, and protect their economic and security interests. While individuals and governments may hold strong moral convictions about which side is justified, framing the long, deeply intertwined history of these nations as a simple battle between absolute good and absolute evil omits the historical grievances and civilian suffering experienced on all sides.


esalman 84 days ago

What I'm hearing is that if you are dealing with something that matters to you a lot, you should avoid relying on LLMs.
