by badrunaway 721 days ago
The current architecture of LLMs focuses mainly on retrieval, and the learned weights simply converge toward the best outcome for next-token prediction. The ability to put this data into a logical system should also have been a training goal, IMO. Next-token prediction + formal verification of knowledge during the training phase itself would give an LLM the ability to keep its knowledge generation consistent and to see the right hallucinations (which I like to call imagination).

The process could look like this:

1. Use existing large models to convert the same previous dataset they were trained on into formal logical relationships. Let them generate multiple solutions

2. Take this enriched dataset and train a new LLM which outputs not only the next token but also the formal relationships between previous knowledge and the newly generated text

3. The network can optimize its weights until the generated formal code scores high accuracy on a proof checker, alongside the token-generation accuracy function
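A minimal sketch of how the objective in step 3 might be combined, assuming the formal output is a list of `(subject, relation, object)` facts. The function names, the `weight` parameter, and the toy checker are all hypothetical: a real proof checker is far richer, and this penalty is not differentiable, so a real training loop would need some other way to feed the signal back.

```python
import math

def next_token_loss(probs: dict[str, float], target: str) -> float:
    """Cross-entropy for a single prediction step: -log p(target)."""
    return -math.log(probs[target])

def verified_fraction(facts: list[tuple[str, str, str]]) -> float:
    """Toy stand-in for a proof checker (an assumption, not a real prover).

    A fact (a, rel, b) is rejected if the reversed fact (b, rel, a) is also
    asserted, which is a contradiction for an ordering relation such as "gt".
    """
    if not facts:
        return 1.0
    asserted = set(facts)
    accepted = sum(1 for (a, rel, b) in facts if (b, rel, a) not in asserted)
    return accepted / len(facts)

def combined_loss(probs, target, facts, weight=1.0):
    """Token loss plus a penalty for facts the checker rejects."""
    return next_token_loss(probs, target) + weight * (1.0 - verified_fraction(facts))
```

With consistent facts the penalty term is zero and only the token loss remains; `weight` controls how much a failed check costs relative to a wrong token.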

In my own mind I feel language is secondary; it's not the base of my intelligence. The base seems more like a dreamy simulation where things are consistent with each other, and language is just what I use to describe it.

7 comments

This suggestion revisits the classic "formal top-down" vs "informal bottom-up" approaches to building a semantic knowledge management system. Top-down was tried extensively in the pre-big-data, pre-probabilistic-model era, but it required extensive manual human curation while being starved for knowledge. The rise of big data brought no cure for the curation problem: because curation can't be automated, larger scale just made the problem worse. AI's transition to probability (in the ~1990s) paved the way to the associative probabilistic models in vogue today, and there's no sign that a more-curated, more-formal approach has any hope of outcompeting them.

How to extend LLMs to add mechanisms for reasoning, causality, etc (Type 2 thinking)? However that will eventually be done, the implementation must continue to be probabilistic, informal, and bottom-up. Manual human curation of logical and semantic relations into knowledge models has proven itself _not_ to be sufficiently scalable or anti-brittle to do what's needed.

> How to extend LLMs to add mechanisms for reasoning, causality, etc (Type 2 thinking)?

We could just use RAG to create a new dataset. Take each known concept or named entity, search for it inside the training set (1), search for it on the web (2), and generate text about it with a bunch of models in closed-book mode (3).

Now you have three sets of text; put all of them in a prompt and ask for a Wikipedia-style article. If the topic is controversial, note the controversy and the distribution of opinions. If it is settled, note that too.

By contrasting web search with closed-book materials we can detect biases in the model and missing knowledge or skills. If they don't appear in the training set, you know what is needed in the next iteration. This approach combines self-testing with topic-focused research to integrate information spread across many sources.
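The three-source comparison can be sketched roughly as follows. The function names `build_article_prompt` and `find_gaps`, and the idea of representing claims as plain strings in sets, are assumptions for illustration; real corpus search, web search, and claim matching are left out.

```python
def build_article_prompt(concept, corpus_hits, web_hits, closed_book):
    """Assemble one prompt from the three sets of text (hypothetical layout)."""
    parts = [
        f"Write a Wikipedia-style article about: {concept}",
        "If the topic is controversial, note the controversy and the "
        "distribution of opinions. If it is settled, note that too.",
    ]
    sections = [
        ("Training-set passages", corpus_hits),
        ("Web search results", web_hits),
        ("Closed-book model answers", closed_book),
    ]
    for title, texts in sections:
        parts.append(f"[{title}]")
        parts.extend(f"- {text}" for text in texts)
    return "\n".join(parts)

def find_gaps(web_claims, closed_book_claims, corpus_claims):
    """Contrast web search with closed-book output.

    Returns (claims the model failed to produce,
             the subset of those that is also absent from the training set).
    """
    missing_from_model = set(web_claims) - set(closed_book_claims)
    missing_from_corpus = missing_from_model - set(corpus_claims)
    return missing_from_model, missing_from_corpus
```

Claims in the second returned set are candidates for the next training iteration; claims only in the first set were available in the corpus but not recalled, which points at a model weakness rather than a data gap.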

I think of this approach as "machine study" where AI models interact with the text corpus to synthesize new examples, doing a kind of "review paper" or "wiki" reporting. This can be scaled for billions of articles, making a 1000x larger AI wikipedia.

Interacting with search engines is just one way to create data with LLMs. Interacting with code execution and humans are two more ways. Just human-AI interaction alone generates over one billion sessions per month, where LLM outputs meet with implicit human feedback. Now that most organic sources of text have been used, the LLMs will learn from feedback, task outcomes and corpus study.

Yes, that's why there was no human in the loop and I was using LLMs as a proxy for a bottom-up approach in step 1. But hallucinations can creep into the knowledge graph too, as mentioned by another commenter
Yann LeCun said something to the effect that you cannot get reasoning with fixed computation budgets, which I found to be a simple way to explain and understand a hypothesized limitation
Nothing prevents you from doing chain of thought style unbounded generation.
Logic has problems all its own. See "Gödel, Escher, Bach" or ask why OWL has been around for 20 years and has almost no market share, or why people have tried every answer to managing asynchronous code other than RETE, why "complex event processing" is an obscure specialty and not a competitor to Celery and other task runners. Or for that matter why can't Drools give error messages that make any sense?
As a computational biologist, I've used ontologies quite a bit. They have utility, but there is a bit of an economic mismatch between their useful application and the energy required to curate them. You have some experience in this space. Do you think LLMs could speed up ontology / knowledge graph curation with expert review? Or, do you think structured knowledge has a fundamental problem limiting its use?
LLMs right now don't employ any logic. A system that did could always have corners of "I don't know" or "I can't do that", unlike the current system, which is 100% confident in its answer because it's not actually trying to satisfy any constraint at all. So at some point the system will apply logic, though it may not be as formal as what we do in pure math.
But the problem is with the new stuff it hasn't seen, and questions humans don't know the answers to. It feels like this whole hallucinations thing is just the halting problem with extra steps. Maybe we should ask ChatGPT whether P=NP :)
Haha, asking ChatGPT surely won't work. Everything can "feel" like a halting problem if you want perfect results with zero error while uncertain and ambiguous new data keeps being added.

My take: hallucinations can never be reduced to a perfect zero, but they can be reduced to the point where, 99.99% of the time, these systems hallucinate less than humans, and more often than not their divergences turn out to be creative thought experiments (which I call healthy imagination). If it hallucinates less than a top human does, I say we win :)

Right - the word "hallucination" is used a lot like the word "weed" - it's a made-up thing I don't want, rather than a made-up thing I do want.
How is weed made up? Isn’t it just dried leaves from the cannabis plant?
OP most likely means "weed" like "pest" or "annoyance", i.e. a category of undesirable plants that tend to appear unbidden along with desirable plants. The distinction isn't biological; it's just that when you create a space for growing, the things that grow won't all be what you want.

(The term "weed" for marijuana is just a joke derived from that sense of the word.)

Yeah, but when you come to halting problems at that level of complexity, multi-hierarchical emergent phenomena occur aperiodically and chaotically; that is to say, in the frequency domain the aperiodicity is fractal-like, discrete, and mappable.
For the first step, CYC [1] could be a valid solution. From my experience I would call it a meaningful relation schema for DAGs. There is also an open-source version available [2], but it is no longer maintained by the company itself.

[1] https://cyc.com

[2] https://github.com/asanchez75/opencyc

Interesting. I haven't really looked much into this space. But anything that can provably represent concepts and relationships without losing information can work. The devil might be in the details; nothing is as simple as it looks at first sight.
Formal verification of knowledge/logical relationships? How would you formally verify a sci-fi novel or a poem? What about the paradoxes that exist in nature, or contradictory theories that are each logically correct? This is easier said than done. What you are proposing is essentially 'let's solve this NP-hard problem that we don't know how to solve, and then it will work'.
Oh, exactly. But let me know your thoughts on this: say you have a graph that represents an existing sci-fi novel. Rather than the current model, which just blindly generates text from statistical probabilities, would it not help to have the model's output also try to fit into this rather imperfect sci-fi novel KG, and to notice when it doesn't fit logically? Depending on how strong your logic requirements are, the system can range from least creative to most creative.
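A rough sketch of that "strictness" dial, assuming the KG is a set of `(subject, relation, object)` triples and that a contradiction only means two different objects for a relation that allows a single value. The names `fit_score`, `accept`, `functional_relations`, and `strictness` are hypothetical, and the novel facts are invented for the example.

```python
def fit_score(kg, candidate, functional_relations):
    """Fraction of candidate triples that do not contradict the KG."""
    if not candidate:
        return 1.0
    # For single-valued relations, remember the one object the KG allows.
    known = {(s, r): o for (s, r, o) in kg if r in functional_relations}
    consistent = 0
    for s, r, o in candidate:
        clashes = r in functional_relations and known.get((s, r), o) != o
        if not clashes:
            consistent += 1
    return consistent / len(candidate)

def accept(kg, candidate, functional_relations, strictness):
    """strictness=1.0 tolerates no contradiction; lower values allow more invention."""
    return fit_score(kg, candidate, functional_relations) >= strictness

# Invented novel facts, for illustration only.
novel_kg = {("Vale", "born_on", "Mars"), ("Mars", "is_a", "planet")}
generated = [("Vale", "born_on", "Europa"), ("Vale", "visits", "Europa")]
score = fit_score(novel_kg, generated, {"born_on"})
```

Here one of the two generated triples clashes with the novel's KG, so a strict setting rejects the text while a looser one lets it through as "imagination".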

I was not actually aware that building a KG from text is an NP-hard problem. I will check it out. I thought it was a time-consuming problem when done manually without LLMs, but I didn't think it was THAT hard. Hence I was trying to introduce an LLM into the flow. Thanks, I will read more about all this!

What is the formal logical system?

E.g., KGs (RDF, PGs, ...) are logical, but when constructed automatically they are not semantic in the sense of the ground domain of NLP, and when constructed manually they yield only a tiny ontology. Conversely, fancy powerful logics like modal ones are even less semantic in NLP domains. Code is more expressive, but brings its own issues.

I had KGs in mind, with automated construction that can improve and converge during the training phase. I was hypothesizing that if we give an incentive during the training phase to also construct KGs, and bootstrap the initial KGs from existing LLMs, then convergence towards a semantically correct KG extension during inference can be achieved. What do you think?
> bootstrap the initial KGs from existing LLMs

LLMs generate responses based on statistical probabilities derived from their training data. They do not inherently understand or store an "absolute source of truth." Thus, any KG bootstrapped from an LLM might inherit not only the model's insights but also its inaccuracies and biases (hallucinations). You need to understand that these hallucinations are not errors of logic but they are artifacts of the model's training on vast, diverse datasets and reflect the statistical patterns in that data.

Maybe you could build a retrieval model but not a generative model.

I thought adding "logical" constraints to the existing training loop, using KGs and logical validation, would help reduce wrong semantic formation in the training loop itself. But your point is right: what if the whole knowledge graph is hallucinated during training itself?

I don't have an answer to that. I felt there would be fewer KG representations that fit a logical world than what fits into the current vast vector spaces of the network's weights and biases. But that's just an idea. This whole thing stems from an internal intuition that language is secondary to my thought process, and internally I feel I can just play around with concepts without language. What kind of Large X models will meet that kind of capability, I don't know!

You cannot manufacture new information out of the same data.

Why should you believe the output of the LLM just because it is formatted a certain way (i.e. "formal logical relationships")?