by jrockway 773 days ago
Remembering more than 4096 tokens is my superpower.
3 comments

How ‘bout remembering a million tokens? I’m not feeling too confident about that. Basically my only moat, if there is one, is that I’m able to rely on higher level cognition which LLMs don’t yet have, rather than just on associative memory alone.
Except these models with high token counts tend to forget the tokens at the beginning of the prompt anyways.
This is something that is true in 2024. I would not bet on it being true in 2030.
I might, actually. Think of where electric cars were six years ago — 2018. Not much has changed. Or, at least, there are still fundamental problems to be solved.

In the same way I can imagine that by 2030 LLMs will still have memory problems and hallucinations. Although I’m sure by then we’ll have something better than pure LLMs.

I've heard claims that context without forgetfulness was already reached 2 months ago, but as I'm not a domain expert I don't trust that I can differentiate breakthroughs from marketing BS, and I definitely can't differentiate either of those from a Clever Hans: https://arstechnica.com/information-technology/2024/03/claud...
I work in this field, so here's a comment with a higher signal-to-noise ratio than you'll commonly find on HN when it comes to LLMs: notice how the demo use cases for very long context stuff deal almost universally with point retrieval, and never demonstrate a high degree of in-context learning. That is not coincidental. The ability to retrieve stuff is pretty great and superhuman already. The ability to reason about it or combine it in nontrivial ways still leaves a lot to be desired - for that you have to train (or at least fine-tune) the underlying model. Which IMO is great, because it neatly plugs the gaps in human capability.
but by then we'll have forgotten this conversation, so it nets out
So do I, and worse. Look, all I’m saying is I’m thankful for this crutch that helps me deal with the limitations of my associative memory, so as long as it can’t think and can’t replace me entirely
Bacopa and lion's mane were night and day for my limited memory. But the obvious things helped immensely with dealing with peanut-brained memory: writing simpler code, keeping a notebook while working, etc., and really spending time breaking down problems into much smaller scopes, while simultaneously keeping copious notes on where I am in a given process. Sure, it's not quick, but my work is usually very readable and understandable for the future reader. I'm not convinced that a tool to help me overcome that memory barrier would actually help me write better code, maybe just write worse code faster. Of course, that's probably the corporate goal though.
Lion's mane didn't seem to do much for me. My memory is actually not that bad, though certainly I have seen people with _much_ better memory than mine. I still maintain contact with some of those I met in college, and then later at various companies. It's just that I deal with so much information and so many streams of it that even writing it down would be a massive chore.

I would pay for a pre-packaged system which could _locally_ and _privately_ make sense of all the emails, PDFs, Slack messages, web pages I saw, other documents shared with me, code, etc., and make it all easily queryable with natural language, with references back to the sources. Sooner or later someone will make something like that.

I may remember more than 4096 tokens, but I probably only pay attention to 7 of them at any given moment: https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...
The context heads of an LLM are more analogous to the sort of processing that goes on in, e.g., Broca's area of your brain as opposed to working memory. You can't have anything analogous to working memory as long as LLMs are operating on a strict feed-forward basis[1]. And the fact that LLMs can talk so fluently without anything like a human working memory (yet) is a bit terrifying.

[1] Technically LLMs do have a "forget that last token and go back so I can try again" operation, so this is only 99% true.

No, you don’t. Seven, plus or minus two at best.
I think humans have better general recall whilst lacking any kind of precision. After reading an entire book, I definitely can’t replicate much (if any) of the precise wording of it, but given a reasonably improbable sentence I can probably tell with certainty that it didn’t appear. LLMs are probably much more prone to believing they’ve read things that aren’t there and don’t even pass a basic sanity check, no matter how long the context window.