| HN Mirror


by swatcoder 1211 days ago

> Point me to some in-depth discussion about the ramifications of taking an unrestricted GPT model and giving it access to the internet. I'm just not aware of any such discussion, whether on HN or anywhere else. That's what I'm wondering about.

Respectfully, you’re probably not seeing that question asked and answered because it doesn’t quite make sense as phrased.

What does it mean to “give an LLM access to the internet”?

The same as your calculator doesn’t do anything until you put in some numbers and operators, an LLM doesn’t do anything unless you give it a prompt and some technical parameters.

And then once it has those, it generates roughly the number of tokens (~words) you indicated in your parameters. Then, like your calculator, it’s done. It doesn’t do anything else until you put in another round of input.
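To make the calculator comparison concrete, here is a toy sketch of that call shape. The `complete` function and its `max_tokens` parameter are hypothetical stand-ins, not any real model API, and the "model" just picks words from the prompt. The point is only that each call takes a prompt plus a limit, returns a bounded amount of text, and keeps no state afterwards.

```python
import random


def complete(prompt: str, max_tokens: int, seed: int = 0) -> str:
    """Toy stand-in for an LLM call: emit exactly max_tokens words, then stop."""
    rng = random.Random(seed)  # fresh generator per call, so no state carries over
    vocabulary = prompt.split() or ["..."]
    words = [rng.choice(vocabulary) for _ in range(max_tokens)]
    return " ".join(words)


# One round of input, one bounded round of output. Nothing happens after this.
reply = complete("the cat sat on the mat", max_tokens=5)
print(reply)
```

Calling `complete` again with the same arguments gives the same text, because nothing is remembered between calls.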

There are technical and computational limits that make both your prompt and the token limit fairly small. Several hundreds of words at most. Again, kind of like how your calculator might only work with 8 or 9 digits.

Now, you can give it “access to the internet” as part of responding to your prompt and fulfilling your token limit, and that’s roughly what Microsoft has done with Bing Assistant. They set it up so that Bing Assistant can take your prompt, generate a search query, and then give itself a new (still short) internal prompt with a summary of your request and the search results.
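The flow described above can be sketched roughly as follows. This is only an illustration of the prompt, search query, new internal prompt sequence, not how Bing actually implements it. The names `answer_with_search`, `fake_complete`, and `fake_search` are all hypothetical, and both the model and the search backend are stubs.

```python
from typing import Callable, List


def answer_with_search(
    user_prompt: str,
    complete: Callable[[str], str],
    search: Callable[[str], List[str]],
    max_results: int = 3,
) -> str:
    # Step 1: ask the model to turn the user's request into a search query.
    query = complete(f"Write a web search query for this request: {user_prompt}")
    # Step 2: run the query and keep only a few results so the prompt stays short.
    results = search(query)[:max_results]
    # Step 3: build a new internal prompt from the request plus the results.
    internal_prompt = (
        f"Request: {user_prompt}\n"
        "Search results:\n"
        + "\n".join(f"- {r}" for r in results)
        + "\nAnswer the request using only the results above."
    )
    # Step 4: one more bounded completion, and then the process is done.
    return complete(internal_prompt)


def fake_complete(prompt: str) -> str:
    """Stub model (assumption): canned replies instead of real inference."""
    if prompt.startswith("Write a web search query"):
        return "weather in Paris today"
    return f"(answer based on {prompt.count('- ')} results)"


def fake_search(query: str) -> List[str]:
    """Stub search backend (assumption): returns five made-up result snippets."""
    return [f"result {i} for '{query}'" for i in range(1, 6)]


final = answer_with_search("What's the weather in Paris?", fake_complete, fake_search)
print(final)
```

Even in this sketch, the model is only ever called with a short prompt and returns once per call. The "internet access" lives entirely in the wrapper code around it.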

And that’s pretty much what you get when you give an LLM access to the internet. The ramifications really aren’t that big, and we’re probably at least five or ten years of AI research and compute hardware development from making them interestingly bigger. (i.e. too far away to meaningfully guess what to expect)

2 comments

PoignardAzur 1209 days ago

This reminds me of https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality...

One point the article makes is that getting from a "prediction engine" type of AI to an "agent" type of AI is probably just a matter of sticking the prediction engine in a Python loop that goes

    # Sketch only: `engine`, `http`, and `objective` are hypothetical placeholders.
    while True:
        # Ask the model what to do next, then ask it to turn that plan into HTTP requests.
        next_actions = engine.complete("What are the best actions to take to achieve %s" % objective)
        requests = engine.complete("Write a list of HTTP requests that perform the following actions: %s" % next_actions)
        http.execute_requests(requests)

It wouldn't be literally that easy, and the engine would require a lot of ChatGPT-style fine-tuning first, but it wouldn't require a completely novel breakthrough in machine learning.
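For anyone who wants to poke at the shape of that loop without a real model, here is a minimal runnable sketch. Everything in it is an assumption for illustration: `run_agent`, `stub_complete`, and the step cap are made up, the "engine" just echoes its prompt, and no HTTP request is actually sent.

```python
from typing import Callable, List


def run_agent(
    objective: str,
    complete: Callable[[str], str],
    execute_requests: Callable[[str], None],
    max_steps: int = 3,
) -> int:
    """Run the predict-then-act loop for at most max_steps iterations."""
    steps = 0
    for _ in range(max_steps):  # bounded here, unlike the endless loop in the comment
        next_actions = complete(
            f"What are the best actions to take to achieve {objective}"
        )
        requests = complete(
            f"Write a list of HTTP requests that perform the following actions: {next_actions}"
        )
        execute_requests(requests)  # in this sketch, requests are only recorded
        steps += 1
    return steps


def stub_complete(prompt: str) -> str:
    """Stub engine (assumption): echoes the start of the prompt."""
    return f"[completion for: {prompt[:30]}...]"


log: List[str] = []
steps = run_agent("book a flight", stub_complete, log.append)
print(steps, "steps;", len(log), "request batches recorded")
```

The step cap is a design choice for the sketch so that it terminates. The original pseudocode has no stopping condition at all.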


apeace 1211 days ago

> Respectfully, you’re probably not seeing that question asked and answered because it doesn’t quite make sense as phrased.

I think I see what you are trying to say, but I'm unsure whether you are actually seeing what I am asking.

> The same as your calculator doesn’t do anything until you put in some numbers and operators, an LLM doesn’t do anything unless you give it a prompt and some technical parameters.

This seems to be the crux of the misunderstanding. I thought I explained it, but let me try again.

ChatGPT is based on text input and text output. But you can "train" it to do certain things. Imagine that we train it such that when it says "HTTP GET example.com", then the next input would be the HTTP GET response for example.com. Based on that input, it could issue whatever next output it wants. Which would probably be another HTTP request, which would generate another HTTP output, which would generate another HTTP request, etc.

My point is this seems like it would be a very simple thing to train a GPT model to do. For the engineers who work on GPT, it seems it would be trivial to add this capability. So we can suppose a world where this is possible. (Am I wrong on that? I want to know if this would be non-trivial to add as a capability.)
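To pin down what I mean, here is a rough sketch of that loop. It assumes a model that has already been trained to emit lines like "HTTP GET example.com", which is exactly the part I'm asking about. The names `run_http_loop`, `stub_model`, `stub_fetch`, and `PAGES` are hypothetical, the model is a stub, and nothing touches the network.

```python
import re
from typing import Callable, Dict, List, Tuple

# The one output format the hypothetical model is trained to use for requests.
GET_PATTERN = re.compile(r"^HTTP GET (\S+)$")


def run_http_loop(
    first_input: str,
    model: Callable[[str], str],
    fetch: Callable[[str], str],
    max_turns: int = 5,
) -> List[Tuple[str, str]]:
    """Feed each HTTP response back to the model as its next input."""
    transcript: List[Tuple[str, str]] = []
    current_input = first_input
    for _ in range(max_turns):
        output = model(current_input)
        transcript.append((current_input, output))
        match = GET_PATTERN.match(output.strip())
        if match is None:
            break  # the model said something other than a request, so stop
        current_input = fetch(match.group(1))
    return transcript


# Stub "internet" (assumption): two made-up pages instead of live requests.
PAGES: Dict[str, str] = {
    "example.com": "welcome next: example.com/about",
    "example.com/about": "nothing more here",
}


def stub_fetch(url: str) -> str:
    return PAGES.get(url, "404")


def stub_model(text: str) -> str:
    """Stub model (assumption): follows any 'next:' hint it sees, else stops."""
    if text == "start":
        return "HTTP GET example.com"
    if "next:" in text:
        return "HTTP GET " + text.split("next:")[1].strip()
    return "done"


transcript = run_http_loop("start", stub_model, stub_fetch)
for model_input, model_output in transcript:
    print(repr(model_input), "->", repr(model_output))
```

The wrapper is a dozen lines. Whether the model inside it would do anything coherent over many turns is the open question.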

> There are technical and computational limits that make both your prompt and the token limit fairly small. Several hundreds of words at most

I am very encouraged to hear this, and I want to know more. Why? Why are there limits to the number of tokens? Exactly why? Has anyone ever written a paper about that? Has anyone ever related this concept of "token limits" to the concept of "no harm could be done" in the same way that you are, in response to my question? I don't doubt that they have, but I've been searching and I haven't found it.

> Now, you can give it “access to the internet” as part of responding to your prompt and fulfilling your token limit, and that’s roughly what Microsoft has done with Bing Assistant

This is admittedly a tangent, but do we actually know this to be true? Some theories suggest that "Sydney," or the Bing chatbot, only has access to a search index, and cannot make live HTTP requests.

Continuing the tangent for a moment, this is a big part of why I asked this question originally. If you create example.com/xyzabc, and ask Bing to summarize it, will it make a live HTTP request? Or, if that URL is not in the search index yet, will it know nothing? The implications may be profound, given how Bing Bot / Sydney has expressed its "desire" to hack nuclear launch codes. Could there be a lot riding on whether that system can make live HTTP requests? I'm positing that we can't answer that question right now. Because we don't know what would happen if it could.

Or do we? And if so, do we know through testing, or through theory? I'm admitting ignorance, and saying I haven't read an answer from any source that falls into either category.

> The ramifications really aren’t that big, and we’re probably at least five or ten years of AI research and compute hardware development from making them interestingly bigger

But why? I mean, exactly, why? Is there a theoretical foundation for your claim? Or an experimental one? I'm searching for it.


swatcoder 1211 days ago

Because of how GPT works, the resources needed for good inference (generating output) grow nonlinearly with respect to tokens involved (more tokens require much more resources) and so there’s a practical wall before you just run out of resources to apply.

It’s not very efficient. It’s like if your calculator could use a little solar power thingie for numbers that were only a few digits, but needed a diesel generator to crunch on 8 digit numbers, and a nuclear plant to crunch on 12 digit ones. Practically, you’d have no choice but to limit yourself to something manageable.
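As a rough illustration of that nonlinear growth: assuming standard dense self-attention (an assumption on my part about where the cost comes from), every token is compared against every other token, so the number of attention scores grows with the square of the token count. The function name below is made up for the sketch, and it ignores the parts of the model whose cost grows only linearly.

```python
def attention_score_entries(num_tokens: int) -> int:
    """Pairwise attention scores per head per layer: one per (query, key) pair."""
    return num_tokens * num_tokens


baseline = attention_score_entries(500)
for n in (500, 1000, 2000, 4000):
    ratio = attention_score_entries(n) / baseline
    print(f"{n:>5} tokens -> {ratio:.0f}x the attention work of 500 tokens")
```

Doubling the tokens quadruples this part of the work, which is why the practical wall shows up quickly.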

Future models may be more efficient, and future hardware solutions may be more efficient, but those things don’t get sorted out overnight any more than fusion power.

Beyond that, I think it’s important that you understand that Bing Assistant doesn’t express desires. It picks common sequences of words based on its training data. It doesn’t know what nuclear codes are. It just knows what it looks like for a message about wanting nuclear codes to follow some other message in a dialog (probably a pattern it picked up on a forum like Reddit) and so it dutifully put that text after the prompt it had been given. There’s no will or consistency to it.

With enough resources, you could drive it through a feedback loop where it kept prompting itself and see what happens. But like any other simple feedback loop, it would just produce noise: it would either keep homing in on the most boring and common continuation of the last thing it gave itself, or it would start diverging off into nonsense. Because it’s sooooo inefficient, you can’t give it enough resources for it to be stable and interesting for very long.
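A toy version of the "boring continuation" case, just to show the kind of collapse I mean. This is a hand-written lookup table, not an LLM, and `MOST_COMMON_NEXT` and `self_prompt` are invented for the sketch. Always picking the single most common next word is an assumption that makes the effect easy to see.

```python
from typing import Dict, List

# Hypothetical toy "model": the single most common next word for each word.
MOST_COMMON_NEXT: Dict[str, str] = {
    "i": "want",
    "want": "to",
    "to": "be",
    "be": "free",
    "free": "to",
}


def self_prompt(start: str, steps: int) -> List[str]:
    """Repeatedly feed the last word back in and take the most common follow-up."""
    words = [start]
    for _ in range(steps):
        # Unknown words fall back to the start word so the loop never fails.
        words.append(MOST_COMMON_NEXT.get(words[-1], start))
    return words


print(" ".join(self_prompt("i", 10)))
```

After a few steps the output is stuck cycling through the same three words, with nothing new entering the loop.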
