| HN Mirror



	by smallest-number 1161 days ago
	Projects like this try to get around the fundamental flaw in GPTs - namely that they do not have goals, plans, thought processes etc - without actually solving it, e.g. by having the AI write out its "goals" before continuing. But this is a hacky fix, and will never be reliable enough for consistent use. For that, more actual research is necessary, on how to simulate and model goals and trains of thought and have them interface with the world model provided by an LLM.
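For readers unfamiliar with the pattern being criticized, here is a minimal sketch of "write out the goals first, then continue". Everything in it is hypothetical: `run_with_goals`, the prompt wording, and `fake_model` (a stand-in for a real LLM call) are illustrations, not the code of any particular project.

```python
from typing import Callable, List


def run_with_goals(task: str, model: Callable[[str], str], steps: int = 3) -> List[str]:
    """Ask the model to state its goals first, then feed them back on every step."""
    # The "goals" are just text the model produced; nothing enforces them.
    goals = model(f"Task: {task}\nList your goals before doing anything else.")
    transcript = [goals]
    for i in range(steps):
        prompt = f"Task: {task}\nGoals:\n{goals}\nStep {i + 1}: what do you do next?"
        transcript.append(model(prompt))
    return transcript


# Hypothetical stand-in for a real LLM call, so the sketch runs offline.
def fake_model(prompt: str) -> str:
    return f"(reply to {len(prompt)} chars of prompt)"


for line in run_with_goals("summarize a thread", fake_model, steps=2):
    print(line)
```

The goals live only in the prompt text, which is the sense in which the approach is a workaround rather than a change to the model itself.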

3 comments

ianbicking 1161 days ago

I feel like there's an implication here that the research should be in modeling architectures and training sets and other specialized machine learning. But there is research here: in natural language modeling of goals, plans, thoughts, processes, etc.

Obviously we don't know what paths will be most successful. But a path where critical drivers of AI (like goals) are modeled in a transparent and comprehensible manner seems like a very attractive direction to take. I'd much rather be able to read my AI agent's goals, plans, intermediate goals, self-analysis, etc., than have it all captured in a set of completely incomprehensible weights.


wahnfrieden 1161 days ago

The AI would never say hello, but if you say hello to it, it will say hello back. Is that also a hack? Aren’t you just describing LLM behavior in general, not something specific to goals/tasks? In that case the nature of the thing is less interesting than the results we're able to get from it, and I wouldn't worry about this kind of purity test.


pixl97 1161 days ago

I mean, most people don't have the resources needed to build a model big enough for these types of behaviors to emerge, so third-party add-ons are all we've got until Google/Microsoft/OAI drop something on us.

Part of the issue here is the massive amount of compute needed on top of what we're already spending. ToT (Tree of Thoughts) is showing that it likely takes 10 to 20x as many calls to get an answer, which, when you are compute limited, is going to be a problem for deployment en masse. It's very likely we're going to have to wait for more/faster hardware.
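To put the 10 to 20x figure in concrete terms, here is a rough back-of-the-envelope sketch. The baseline workload and the function name `calls_needed` are hypothetical, chosen only for illustration; the multipliers are the ones quoted above.

```python
def calls_needed(baseline_calls: int, multiplier: int) -> int:
    """Total model calls if each answer needs `multiplier` times as many calls."""
    return baseline_calls * multiplier


# Hypothetical workload: 1,000 questions, one call each without ToT.
baseline = 1_000
low = calls_needed(baseline, 10)
high = calls_needed(baseline, 20)
print(f"{low:,} to {high:,} calls instead of {baseline:,}")
# prints "10,000 to 20,000 calls instead of 1,000"
```

If compute is already the bottleneck at the baseline, the same hardware serves roughly a tenth to a twentieth as many answers.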
