by voodooEntity 32 days ago
So I just finished a long-running project (10 years ftw) only to deep-dive straight into the next one.

I have no public sources yet (they will come at some point), but I'll try to break it down into some simple points. After all, this is a research project.

Project: DeepThought

So instead of taking the path of bigger and bigger models to solve more complex questions, I'm going in another direction. My idea is to use LLMs as a kind of "inner monologue" to replicate a chain of thought: basically, create thinking steps that can be chained dynamically.

Additionally, the project contains a three-layer memory system, split into:

1. Frontbrain: this data composes the context for inference. It's a set of "hot nodes", each with a temperature that cools down a bit per conversation turn and warms up a bit again whenever the node is used in a "thinking process". The idea is that the inference context only gets the currently relevant information, while dropping things that have lost relevance. This should prevent context overflow.

2. STM: basically a session memory. It keeps all information from the current session, even if it got too cold and dropped out of the Frontbrain.

3. LTS: always queryable by the thought process to retrieve information/structures, but information is only propagated from the STM to the LTS at the end of a session. This makes identifying "unique" entities a lot easier and has some other advantages.
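To make the three layers a bit more concrete, here is a minimal sketch of how the cooling/warming and the end-of-session propagation could fit together. Everything in it is an assumption for illustration: the class and method names, the constants, and the reading of LTS as a long-term store are made up and are not taken from the DeepThought code.

```python
from dataclasses import dataclass, field

# Illustrative constants only; the real values are not given in the post.
COOL_PER_TURN = 0.2   # temperature lost per conversation turn
WARM_ON_USE = 0.5     # temperature gained when a thinking step uses a node
DROP_BELOW = 0.1      # nodes colder than this leave the Frontbrain


@dataclass
class Memory:
    frontbrain: dict[str, float] = field(default_factory=dict)  # node id -> temperature
    stm: set[str] = field(default_factory=set)  # everything seen this session
    lts: set[str] = field(default_factory=set)  # assumed: long-term store

    def observe(self, node: str) -> None:
        """New information enters the session memory and starts out hot."""
        self.stm.add(node)
        self.frontbrain[node] = 1.0

    def touch(self, node: str) -> None:
        """A thinking step used this node: warm it up, re-adding it if it had dropped out."""
        if node in self.stm or node in self.lts:
            current = self.frontbrain.get(node, 0.0)
            self.frontbrain[node] = min(1.0, current + WARM_ON_USE)

    def end_turn(self) -> None:
        """Cool every hot node and drop the ones that got too cold."""
        cooled = {n: t - COOL_PER_TURN for n, t in self.frontbrain.items()}
        self.frontbrain = {n: t for n, t in cooled.items() if t >= DROP_BELOW}

    def context(self) -> list[str]:
        """Nodes that would make up the inference context, hottest first."""
        return sorted(self.frontbrain, key=self.frontbrain.get, reverse=True)

    def end_session(self) -> None:
        """Only at session end is the STM propagated into the LTS."""
        self.lts |= self.stm
        self.stm.clear()
        self.frontbrain.clear()
```

In this sketch a node that is never used again falls out of the context after a few turns but stays in the STM, so a later thinking step can still pull it back in with `touch`.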

So when you type something into the DeepThought engine, it extracts all information from your input and converts it into a two-part structure:

1. A bitemporal hypergraph composed of Entities and Hyperatoms. Entities are, I think, fairly easy to grasp; Hyperatoms can represent either "properties" (in the form of facts) or relations to other entities. This allows me to build a typed, graph-structured information network containing the relevant information.

2. Frame summaries. Since a structured graph alone, as just described, loses a lot of procedural/logical information that is relevant especially in more complex contexts, I also create short summary texts that are linked to entities.

These structures allow me to use dynamic graph traversal to search for data, while also retrieving the related Frame summaries, which are a more native format for an LLM to understand logic and relations.
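A rough sketch of what such a structure and a retrieval pass over it might look like. The field names, the two timestamps (`valid_from` for when something became true, `recorded_at` for when the system learned it, which is the usual meaning of "bitemporal"), and the breadth-first traversal are all assumptions for illustration, not the actual DeepThought schema.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Hyperatom:
    kind: str                    # "fact" (a property) or "relation"
    label: str                   # e.g. "role" or "works_at"
    members: tuple[str, ...]     # entity ids the atom touches (one for a fact)
    value: Optional[str] = None  # payload for facts
    valid_from: int = 0          # when it became true in the world
    recorded_at: int = 0         # when the system learned it


@dataclass
class Graph:
    atoms: list[Hyperatom] = field(default_factory=list)
    frames: dict[str, list[str]] = field(default_factory=dict)  # entity id -> summaries

    def neighbours(self, entity: str, as_of: int) -> set[str]:
        """Entities related to `entity`, using only atoms known at time `as_of`."""
        out: set[str] = set()
        for atom in self.atoms:
            if (atom.kind == "relation" and entity in atom.members
                    and atom.recorded_at <= as_of):
                out.update(m for m in atom.members if m != entity)
        return out

    def retrieve(self, start: str, max_hops: int, as_of: int):
        """Breadth-first traversal; returns reached entities and their frame summaries."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            entity, depth = queue.popleft()
            if depth == max_hops:
                continue
            for nxt in sorted(self.neighbours(entity, as_of)):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        summaries = [s for e in sorted(seen) for s in self.frames.get(e, [])]
        return seen, summaries
```

The point of returning both pieces is that the graph gives the precise structure, while the summaries carry the "why" and "how" that would be handed to the LLM as plain text.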

This is a very, very superficial explanation, because going into detail would prolly take multiple pages of info.

Important: I'm running this on a local 5090 and it is NOT friendly in terms of the number of inferences (which is fine for me). I'm trying to mimic a thought process, not build a fast-shipping product. Quality > quantity. If you ran DeepThought on any online inference provider, you'd be broke in 1 day.

So, rn I'm focusing on the ingestion and retrieval logic, to make storing and retrieving as good as possible with my hardware options.

Ingestion already involves multiple steps in which the "LLM" basically works as a judge to decide where to traverse in the graph, where to go into recursion and similar. This will become very relevant as soon as I start implementing "task execution" as a capability.
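As a toy illustration of the "LLM as judge" idea, the traversal below asks a judge function at every edge whether it is worth following, and recurses only where the answer is yes. The judge here is a plain keyword check standing in for a real model call; `judged_traverse`, `keyword_judge` and the adjacency-dict graph are hypothetical names and shapes, not part of DeepThought.

```python
from typing import Callable

# (question, candidate node) -> should the traversal follow this edge?
Judge = Callable[[str, str], bool]


def judged_traverse(graph: dict[str, list[str]], start: str, question: str,
                    judge: Judge, max_depth: int = 3) -> list[str]:
    """Depth-first traversal where the judge decides which edges to recurse into."""
    visited: list[str] = []

    def visit(node: str, depth: int) -> None:
        visited.append(node)
        if depth == max_depth:
            return
        for nxt in graph.get(node, []):
            # One judge call per candidate edge: with a real LLM this is one
            # inference each, which is why the inference count grows quickly.
            if nxt not in visited and judge(question, nxt):
                visit(nxt, depth + 1)

    visit(start, 0)
    return visited


def keyword_judge(question: str, candidate: str) -> bool:
    """Stand-in for an LLM call: follow nodes that are mentioned in the question."""
    return candidate.lower() in question.lower()
```

Swapping `keyword_judge` for a function that prompts a local model keeps the traversal code unchanged, and it also shows where the cost comes from: every branching decision is an inference.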

Once I've solved those, the next point is to reduce everything I need in terms of thinking steps to what I would call "thinking primitives". The idea is that I don't want a hardcoded thinking process; I want the thinking process itself in the form of a graph structure. This would allow me to compose the process as data in the hypergraph, which in turn would allow the system to refactor/enhance its own thought processes.
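A tiny sketch of what "process as data" could mean in practice: the primitives are ordinary functions, while the order in which they run lives in a plain data structure that a small interpreter walks. The primitive names (`extract`, `recall`, `answer`), the node layout and the `run` function are invented for this example; the real primitives would presumably involve LLM calls.

```python
from typing import Callable, Optional

# Hypothetical thinking primitives: each takes a state dict and returns a new one.
PRIMITIVES: dict[str, Callable[[dict], dict]] = {
    "extract": lambda s: {**s, "terms": s["input"].lower().split()},
    "recall": lambda s: {**s, "recalled": [t for t in s["terms"] if t in s["memory"]]},
    "answer": lambda s: {**s, "output": ", ".join(s.get("recalled", s["terms"]))},
}

# The thinking process itself is data: nodes naming a primitive plus an edge to the next node.
process = {
    "s1": {"primitive": "extract", "next": "s2"},
    "s2": {"primitive": "recall", "next": "s3"},
    "s3": {"primitive": "answer", "next": None},
}


def run(process: dict, start: str, state: dict, max_steps: int = 100) -> dict:
    """Walk the process graph, applying one primitive per node."""
    step: Optional[str] = start
    for _ in range(max_steps):  # guard against accidental cycles
        if step is None:
            break
        node = process[step]
        state = PRIMITIVES[node["primitive"]](state)
        step = node["next"]
    return state
```

Because the process is just data, changing how the system "thinks" is an edit to the graph rather than to the code: setting `process["s1"]["next"] = "s3"` skips the recall step entirely, and nothing stops the system itself from making that kind of edit.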

So yeah, that's what I'm working on rn: a very early concept/alpha phase.