by qrybam 1185 days ago
A couple of years ago I was on a walk with my wife and she asked me “what’s the point of Siri? It can hardly do anything useful outside weather and timers.” I told her a tall tale of how these things are essentially going to become our personal assistants in the future: they’ll know about our unique situations and can act accordingly. Just as the rich and powerful have people who specialise in organising their lives so they get more done in less time, so AI through natural interfaces will give all of us many of those same benefits.

Progress… This is the first time I see a product that fits that description.

4 comments

You're assuming that the AI assistant will have your own interests as its priority, and not the interests of some other party. If this 'tool' is being supplied by a government or corporation, then it could be used to create a very static hierarchy - imagine some incompetent upper-level bureaucrat using it to discover and sabotage any competent lower-level employees who might eventually present threats to their own position? Germany's STASI would have also used it as a mass-surveillance tool, and China today would use it to generate individual social credit scores.

It does have great promise in an open-source self-hosted incarnation not controlled by external actors, however.

> It does have great promise in an open-source self-hosted incarnation not controlled by external actors, however.

I'm not even sure about that, entirely. My very limited understanding of this is that a core requirement is the initial data - the large language models(?). Which of these you can use, or how it's initially developed/populated, will have an influence on the answers you get and how it may evolve/"learn".

Instead of trusting the external corp to run the service, you need to trust whatever actors are building the base data sets, and be concerned what sort of bias may be inherent.

Or do I have this totally wrong?

I think for now, the data requirements to train a SOTA LLM are so extreme we don’t have the luxury of being picky with the training data. We are getting close to the point where there isn’t enough human written text in existence to continue scaling these models.

Model refinement seemingly has lower training requirements, putting it within the reach of smaller organizations or wealthy individuals. If you don’t like the refinement dataset it will likely be feasible to bootstrap your own off someone else’s LLM. See what Stanford did with Alpaca.

I'm waiting for a general correction mechanism, I don't even know what to call it. "NO, chatgpt, people usually have 5 fingers", and the gpt just learns, rather like a child. I keep thinking that's the next real step.
The problem is that, to the extent the analogy of ChatGPT to a living thing makes sense, the individual isn’t the model (that's just the common species-defining—or maybe “clone family” is better than “species”—set of instincts), the individual lifespan is the conversation.

You could share feedback across conversations by allocating prompt space to it, at the expense of limiting the size of the conversation, but you'd need a way to decide what to share this way.
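
A rough sketch of that trade-off, in Python. Everything here is an assumption for illustration: the function names are made up, the "tokenizer" is just a whitespace word count, and "keep the newest notes that fit" is only one possible answer to the question of what to share.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def take_recent(items, budget):
    """Keep the newest items whose combined token count fits in `budget`."""
    kept, used = [], 0
    for item in reversed(items):
        cost = count_tokens(item)
        if used + cost > budget:
            break
        kept.append(item)
        used += cost
    kept.reverse()  # restore chronological order
    return kept, used


def build_prompt(feedback, conversation, context_limit, feedback_budget):
    """Reserve part of the context window for shared feedback notes.

    Whatever the notes consume is no longer available to the conversation,
    which is the cost described above.
    """
    notes, used = take_recent(feedback, min(feedback_budget, context_limit))
    turns, _ = take_recent(conversation, context_limit - used)
    return notes + turns


feedback = ["people usually have five fingers", "user prefers metric units"]
conversation = ["hi there", "draw a hand please", "here is a hand"]
prompt = build_prompt(feedback, conversation, context_limit=12, feedback_budget=5)
print(prompt)
# ['user prefers metric units', 'draw a hand please', 'here is a hand']
```

With a 12-token window and 5 tokens reserved for feedback, only one note fits and the oldest conversation turn is dropped, so the selection policy decides both what is remembered and what is forgotten.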

You could also take the conversation and use it as part of the reinforcement learning dataset. I feel like that's the closest thing to long term memory ChatGPT is capable of right now.
I think what's mainly stopping that from happening is that GPT-4 doesn't remember older chats. If we make it remember everything, it should get more personal, right?
The token limit is the problem. In general, token limits can’t be changed after the model has been trained. GPT-4 has an exceptionally large 32k token limit, but even with 32k tokens you’d only get a few weeks of chat before the context window was full.

Not to mention the added cost of using the full 32k tokens. OpenAI is charging as much as $0.12 per 1K tokens, which would quickly add up. It’s prohibitively expensive unless you have a very, very compelling business use case.
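
To put a number on it, here is a back-of-the-envelope calculation in Python. The prices are assumptions, not something stated in this thread: $0.06 per 1K prompt tokens and $0.12 per 1K completion tokens for the 32k model, which is my recollection of the launch pricing. The function name is made up too.

```python
# Assumed prices in USD per 1,000 tokens (not from the thread; check current pricing).
PROMPT_PRICE_PER_1K = 0.06
COMPLETION_PRICE_PER_1K = 0.12

# Round figure taken from the comment above.
CONTEXT_TOKENS = 32_000


def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD of a single request under the assumed prices."""
    return (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K + (
        completion_tokens / 1000
    ) * COMPLETION_PRICE_PER_1K


# Resending a full context window as the prompt, before any reply is generated.
full_context = request_cost(CONTEXT_TOKENS, 0)
print(f"one full-context request: ${full_context:.2f}")
print(f"100 such requests:        ${100 * full_context:.2f}")
```

Under those assumed prices a single request that fills the window costs about $1.92 before the model writes a word, and every follow-up message pays that again because the whole history is resent each time.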

>We are getting close to the point where there isn’t enough human written text in existence to continue scaling these models.

People say this, but GPT-3 (the latest we know the details on) was 45TB of text, which may be most of the open Internet, but still lacks non-publicly-indexed Internet text (i.e. things behind paywalls, things behind log-in screens like emails), any book outside of Bibliotik's 200k books (remember when Google was randomly digitizing all books it could get its hands on?), and plenty of other non-digitized text.

OpenAI wants you to believe that we are running out of text, but at Google alone there are hundreds of TB of text that OpenAI doesn't have access to (Google Books, Google Docs, Gmail, search queries, archived pages beyond what CommonCrawl gets, paywalled news articles that allow Google to crawl them, etc.).

Now the key question that GPT-4 will hopefully answer is "are bigger datasets really the key, or are larger context windows?"

If you're thinking of investing in/working for OpenAI, you better hope the answer is context windows.

that's why I'm working on my own assistant, with a fine-tuned model which actually learns and memorizes stuff about the user :)
Part of the problem is... people are not good at stating their problem with words. A lot of the time they have a vague idea of disconnected parts. By the time they are able to write down the problem decently, it is already half solved
Well, 5-10 years from now your gf may ask you again about the usefulness of the tools that are being released today...
You were really playing the long game with that one.