

by aghilmort 387 days ago

most mainstream models are decoder-only rather than encoder-decoders, diffusers, etc., and lack reversible causal reasoning. that can be counter-intuitive, since it doesn’t feel that way when models can regenerate prior content

some hacks for flipping time / position / space in these models:

- test the spate of diffusion models now emerging. pro is speed, con is smaller context; ymmv depending on whether the model was trained on that language and/or the context is large enough to load language-booster info via ICL (in-context learning)

- exploit known LTL tricks that may work; there's a bunch of these

- e.g., tell the model to gen drafts in some sort of RPN (reverse Polish notation) variant of the language; if that doesn't hold up in tests, tell it to simulate creating such a fork of the language, and then gen the clean standard form at the end

- have it be explicit about leapfrogging recall and reasoning, e.g., be excessively verbose with comments that you can regex-strip later

- have it build a stack / combo of the RPN, CoT, and bootstrapping its own ICL

- exploit causal markers: think tags that can splinter time. this can really boost any of the above methods, e.g., give each instance of a thing its own disjoint time tag (A1 vs. K37 for numbered instances of things that share a given space), like a time GUID

- use orthogonal groups of such tags to splinter time and space recall and reasoning in the model, including seemingly naive things like "pass 1", etc.

- our recent arXiv paper on HDRAM / hypertokens pushes causal markers to a classic-quantum holographic extreme and was built for this; the next version will be more accessible

- the motivators are simple: models fork on prefix-free modulo embedding noise, so the more you make prefix-free, the better the performance. there are some massive caveats on how to do this perfectly, which is exactly what our work addresses. think 2x to 10x gain on model and similar on reasoning; again ymmv as we update the preprint, post a second paper that makes the baseline better, prep a git release, etc. the aim is to make it tons easier to get better recall, and to exploit that for better reasoning, by making it possible for any model to do the equivalent of arbitrary RPN

- our future state is exactly this: a prompt compiler for this use case, i.e., explainable time-independent computation in any model
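a few of the hacks in the list above (disjoint time tags per group, verbose comments you regex-strip later, RPN drafts turned back into standard form) can be sketched in plain Python. this is only a rough illustration and not the HDRAM / hypertoken method: the `[A1]` tag format, the `#` comment convention, and all function names here are assumptions made up for the example.

```python
import re
from itertools import count


def tag_lines(lines, prefix):
    """Prefix each line with a unique tag from one group, e.g. '[K1] ...'.

    Hypothetical scheme: one letter per group, one counter per group,
    so tags from different groups never collide.
    """
    counter = count(1)
    return [f"[{prefix}{next(counter)}] {line}" for line in lines]


def strip_scaffolding(text):
    """Remove leading [TAG] markers and '#' comments from model output.

    Naive on purpose: a '#' inside a string literal would also be cut.
    """
    text = re.sub(r"(?m)^\[[A-Z]\d+\] ?", "", text)  # drop tags like [A1], [K37]
    text = re.sub(r"[ \t]*#[^\n]*", "", text)        # drop '#' comments
    return "\n".join(line for line in text.splitlines() if line.strip())


def rpn_to_infix(tokens):
    """Convert an RPN token list into a fully parenthesized infix string."""
    stack = []
    for tok in tokens:
        if tok in {"+", "-", "*", "/"}:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {tok} {right})")
        else:
            stack.append(tok)
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]
```

usage would be something like: `tag_lines(steps, "A")` for one pass and `tag_lines(steps, "K")` for another, paste the tagged text into the prompt, ask for a heavily commented RPN-style draft, then run `strip_scaffolding` and `rpn_to_infix` on what comes back. whether any of this actually helps a given model is the ymmv part.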