| HN Mirror

Haha, blursed one might say. In seriousness though, the social avoidance of wanting to talk to a computer around others will likely be the largest bottleneck to adoption for this sort of tech. May need to initially frame it as for work from home engineers.

Luckily the other side to this project doesn't require any user behavioural changes. The idea is to convert chat histories into a tree format with the same core algorithm, and then send only the relevant sub-tree to the LLM, reducing input tokens and context bloat, thereby also improving accuracy. This would then also unlock almost infinite length LLM chats. I have been running this LLM context retrieval algo against a few benchmarks, GSM-infinite, nolima, and longbench-v2 benchmarks, the early results are very promising, ~60-90% reduced tokens and increased accuracy against SOTA, however only on a subset of the full benchmark datasets.