by _heimdall
468 days ago
This paper [1] may be an interesting place to start.
We only know how the structures are designed to work, and we have hypotheses about how they likely work. We can't interpret what actually happens while the LLM is generating a response.
That seems pedantic or unimportant on the surface, but there are some really important implications. At the more benign level, we don't know why a model gave a bad response when a person wasn't happy with the output. On the more important end, any concerns related to the risk of these models becoming self-directed or malicious simply can't be recognized or guarded against. We won't know if a model becomes self-directed until after it acts on it in ways that don't match how we already expect it to work.
Both alignment and interpretability were important research topics for decades of AI research. We effectively abandoned those topics once we made real technological advancement - once an AI-like tool was no longer entirely theoretical, we couldn't be bothered focusing resources on figuring out how to do it safely. The horse was already out of the barn.
Does this mean they will turn evil or end up going poorly for us? Absolutely not. It just means that we have to cross our fingers and hope because we can't detect issues early.
[1] https://arxiv.org/abs/2309.01029
There are two things we’re talking about here.
There’s the physical, mechanical operations going on during inference and there’s potentially a higher order process happening as an emergent property of those mechanical operations.
We know precisely the mechanical operations that take place during inference as they are machine instructions which are both man-made and very well understood. I hope we can agree here.
Then there’s potentially a higher order process. Whether that process exists, and what it is, are still a mystery.
We do not know how the human brain works, physically. We can’t inspect discrete units of brain operations as we can with machine instructions.
For that reason, it is uncritical to assume that there is any kind of “thought” process occurring at inference which is similar to our thought processes.
Comparing the two is like comparing apples and oranges anyway, and is pedantic in a non-useful way, especially given our limited understanding of the human brain.