HN Mirror


	by jrockway 773 days ago
	Remembering more than 4096 tokens is my superpower.

3 comments

ein0p 773 days ago

How ‘bout remembering a million tokens? I’m not feeling too confident about that. Basically my only moat, if there is one, is that I’m able to rely on higher-level cognition, which LLMs don’t yet have, rather than on associative memory alone.

iwontberude 773 days ago

Except these models with high token counts tend to forget the tokens at the beginning of the prompt anyways.

margalabargala 773 days ago

This is something that is true in 2024. I would not bet on it being true in 2030.

xanderlewis 773 days ago

I might, actually. Think of where electric cars were six years ago — 2018. Not much has changed. Or, at least, there are still fundamental problems to be solved.

In the same way I can imagine that by 2030 LLMs will still have memory problems and hallucinations. Although I’m sure by then we’ll have something better than pure LLMs.

ben_w 773 days ago

I've heard claims that context without forgetfulness was already reached 2 months ago, but as I'm not a domain expert I don't trust that I can differentiate breakthroughs from marketing BS, and I definitely can't differentiate either of those from a Clever Hans: https://arstechnica.com/information-technology/2024/03/claud...

ein0p 772 days ago

I work in this field, so here's a comment with higher signal-to-noise ratio than you'll commonly find on HN when it comes to LLMs: notice how the demo use cases for very long context stuff deal almost universally with point retrieval, and never demonstrate a high degree of in-context learning. That is not coincidental. The ability to retrieve stuff is pretty great and superhuman already. The ability to reason about it or combine it in nontrivial ways leaves a lot to be desired still - for that you have to train (or at least fine tune) the underlying model. Which IMO is great, because it neatly plugs the gaps in human capability.

riwsky 773 days ago

but by then we'll have forgotten this conversation, so it nets out

ein0p 773 days ago

So do I, and worse. Look, all I’m saying is I’m thankful for this crutch that helps me deal with the limitations of my associative memory, so long as it can’t think and can’t replace me entirely.

yonaguska 772 days ago

Bacopa and lion's mane were night and day for my limited memory. But the obvious things helped immensely with my peanut-brained memory: writing simpler code, keeping a notebook while working, really spending time breaking problems down into much smaller scopes, and keeping copious notes on where I am in a given process. Sure, it's not quick, but my work is usually very readable and understandable for the future reader. I'm not convinced that a tool to help me overcome that memory barrier would actually help me write better code; maybe it would just help me write worse code faster. Of course, that's probably the corporate goal.

ein0p 772 days ago

Lion's mane didn't seem to do much for me. My memory is actually not that bad, though I have certainly seen people with _much_ better memory than mine. I still keep in contact with some of them whom I met in college and later at various companies. It's just that I deal with so much information, and so many streams of it, that even writing it down would be a massive chore.

I would pay for a pre-packaged system which could _locally_ and _privately_ make sense of all the emails, PDFs, Slack messages, web pages I saw, other documents shared with me, code, etc., and make it all easily queryable with natural language, with references back to the sources. Sooner or later someone will make something like that.

ben_w 773 days ago

I may remember more than 4096 tokens, but I probably only pay attention to 7 of them at any given moment: https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...

Symmetry 772 days ago

The context heads of an LLM are more analogous to the sort of processing that goes on in, e.g., Broca's area of your brain as opposed to working memory. You can't have anything analogous to working memory as long as LLMs are operating on a strict feed-forward basis[1]. And the fact that LLMs can talk so fluently without anything like a human working memory (yet) is a bit terrifying.

[1] Technically LLMs do have a "forget that last token and go back so I can try again" operation, so this is only 99% true.

pwdisswordfishc 773 days ago

No, you don’t. Seven, plus or minus two at best.

xanderlewis 773 days ago

I think humans have better general recall whilst lacking any kind of precision. After reading an entire book, I definitely can’t replicate much (if any) of the precise wording of it, but given a reasonably improbable sentence I can probably tell with certainty that it didn’t appear. LLMs are probably much more prone to believing they’ve read things that aren’t there and don’t even pass a basic sanity check, no matter how long the context window.
