
The article explains that:

> Infactory will pull information directly from trusted resources

Of course it means assuming anything the NYT says is true. Biden is sharp as a tack, it's a fact!

Nah, actually the article says they'll avoid politics. They want to be a better Bloomberg Terminal apparently, and only focus on quantitative data for business purposes. Basically OurWorldInData + LLMs.

In theory you can actually do a reasonable job of this sans Orwell. You train a model on a really wide selection of sources and then get it to spit out the knowledge that doesn't seem to have any disagreement within the dataset. The assumption is that no disagreement = fact. This heuristic isn't bad, but it's been tried before and is prone to errors in a few well-known cases. Google tried it years ago, predating LLMs. It worked well for things like "how high is the Eiffel Tower" but, unsurprisingly, one place it worked poorly is political ideology and terminology. Different political tribes often have their own ways of using language.
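To make the "no disagreement = fact" heuristic concrete, here's a minimal sketch. It isn't how Google or Infactory actually do it; the `consensus_facts` function, the `min_sources` threshold and the toy data are all made up for illustration.

```python
from collections import defaultdict

def consensus_facts(assertions, min_sources=2):
    """Keep only answers that every source agrees on.

    `assertions` is an iterable of (source, question, answer) tuples.
    `min_sources` is a hypothetical threshold: a single uncontested
    source is not treated as consensus.
    """
    answers = defaultdict(set)   # question -> distinct answers seen
    sources = defaultdict(set)   # question -> distinct sources seen
    for source, question, answer in assertions:
        answers[question].add(answer)
        sources[question].add(source)
    return {
        q: next(iter(vals))
        for q, vals in answers.items()
        if len(vals) == 1 and len(sources[q]) >= min_sources
    }

# Toy data, invented for this example.
assertions = [
    ("site_a", "eiffel_tower_height_m", "330"),
    ("site_b", "eiffel_tower_height_m", "330"),
    ("site_c", "some_contested_question", "yes"),
    ("site_d", "some_contested_question", "no"),
    ("site_e", "single_source_claim", "42"),
]
facts = consensus_facts(assertions)
```

Note that the filter only sees whether sources disagree, not why. If one side of a dispute simply doesn't write about it, the claim looks uncontested and passes.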

Example: "is George Bush a war criminal?" It turns out that the internet is full of documents asserting this to be true, and not many asserting it to be false. This isn't because it's a widely agreed fact. It's because the left believes the concept of war criminal makes sense and can be applied liberally, but the right doesn't. Presented with this statement, the right tends to ask: a criminal according to which court and which government? The left says according to international law, which is, again, a concept the right doesn't really recognize as legitimate to begin with, because they think that law inherently flows from the concept of a nation state or empire, not a group of NGOs.

At heart the "problem", if you want to call it that, is that the left is generally more passionate about politics and power, so if they believe in a concept they do things like take poorly paid journalism jobs and write lots of articles that take for granted the legitimacy of their ideological precepts. People on the right do things like go work in banking or agriculture or oil, or indeed the tech industry, and don't end up with much time to spend arguing with them. So these concepts filter into the dataset without pushback. Deploy them in real-world debate, though, and suddenly you get that pushback.

(This seems to be one of the reasons that a naively trained LLM ends up super woke: the internet is just left-biased due to the greater output of words from that tribe.)

Fortunately there's a limited number of cases like this. The set of such cases does grow over time, but at a relatively slow rate. In theory, if you had people really and truly committed to neutrality and the pursuit of truth, you could use LLMs to find claims that are both lacking in disagreement and also not dependent on ideologically disputed concepts. LLMs are actually pretty good at the sort of vagueness and nuance that this kind of understanding requires.
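As a rough sketch of that two-part filter: keep a claim only if nobody in the dataset disputes it and it doesn't lean on a contested concept. Everything here is hypothetical, including the `neutral_consensus` name, the `dissenting_sources` field and the hand-written term list. The plain substring match is only a stand-in for the judgment you'd actually want an LLM to make.

```python
def neutral_consensus(claims, disputed_terms):
    """Return claim texts that are undisputed AND avoid contested concepts.

    `claims` is a list of dicts with hypothetical keys:
      "text"               - the claim as a string
      "dissenting_sources" - how many sources in the dataset contradict it
    `disputed_terms` is a curated list of ideologically contested concepts.
    """
    terms = [t.lower() for t in disputed_terms]
    kept = []
    for claim in claims:
        if claim["dissenting_sources"] > 0:
            continue  # fails the "no disagreement" test
        text = claim["text"].lower()
        if any(term in text for term in terms):
            continue  # undisputed in the data, but rests on a contested concept
        kept.append(claim["text"])
    return kept

# Toy data, invented for this example.
claims = [
    {"text": "The Eiffel Tower is in Paris", "dissenting_sources": 0},
    {"text": "George Bush is a war criminal", "dissenting_sources": 0},
    {"text": "Policy X reduced unemployment", "dissenting_sources": 3},
]
result = neutral_consensus(claims, disputed_terms=["war criminal"])
```

The second claim is the interesting one: it clears the disagreement check in this toy dataset and is only caught by the concept list, which is exactly the part that depends on someone being committed to neutrality when they write that list.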

The problem is that such a program would be very boring and not commercially useful. Ironically, the very concept of fact checking is itself an "is George Bush a war criminal" type problem. The right takes it for granted that reality is complex and depends on perspective; the left takes it for granted that reality is simple and can be painted in black/white, correct/incorrect. So the right doesn't spend much time on "fact checking" as a concept, because they see it as a quasi-illegitimate endeavor to begin with. From their POV there isn't actually much disagreement on things that are genuinely empirical facts, like the speed of light or the price of USD:GBP yesterday at noon, so what fact checkers end up spending time on is in reality a sort of political censorship / propaganda operation aimed at shutting down any viewpoint they don't like.

So an AI dedicated to genuine checking of empirical facts would probably find that there isn't much to do. People are pretty good at agreeing on facts already, so there are not that many errors to fix (and the errors that do sneak through are rarely important). An AI dedicated to the sort of fact checking that Snopes engages in would be very busy indeed, but there are plenty of people willing to work for peanuts to engage in ideological warfare against the right, so where's their commercial edge? Seems like another business failure waiting to happen.